I am trying to create a function that normalizes a given dataframe for a requested frequency.
Code:
import numpy as np
import pandas as pd
def timeseries_dataframe_normalized(df, normalization_freq='complete'):
    """
    Input:
        df : dataframe
            input dataframe
        normalization_freq : string
            'daily', 'weekly', 'monthly','quarterly','yearly','complete' (default)
    Return: normalized dataframe
    """
    # auxiliary dataframe
    adf = df.copy()
    # convert columns to float
    # Ref: https://stackoverflow.com/questions/15891038/change-column-type-in-pandas
    adf = adf.astype(float)
    # normalized columns
    nor_cols = adf.columns
    # add suffix to columns and create new names for maximum columns
    max_cols = adf.add_suffix('_max').columns
    # initialize maximum columns
    adf.loc[:, max_cols] = np.nan
    # check the requested frequency
    if normalization_freq == 'complete':
        adf[max_cols] = adf[nor_cols].max()
    # compute and return the normalized dataframe
    print(adf[nor_cols])
    print(adf[max_cols])
    adf[nor_cols] = adf[nor_cols] / adf[max_cols]
    # return the normalized dataframe
    return adf[nor_cols]

# Example
df2 = pd.DataFrame(data={'A':[20,10,30],'B':[1,2,3]})
timeseries_dataframe_normalized(df2)
Expected output:
df2 =
          A         B
0  0.666667  0.333333
1  0.333333  0.666667
2  1.000000  1.000000
Present output:
I was surprised to get the error below. When I compute df2/df2.max() directly, I get the expected output, but the function raises an error instead.
ValueError: Columns must be same length as key
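
Since df2/df2.max() already gives the expected result, the 'complete' case can probably be written without the helper `_max` columns at all. The snippet below is only a minimal sketch under that assumption. The name `normalize_by_column_max` is hypothetical and not part of the code above, and the other frequencies ('daily', 'weekly', and so on) are not handled. My guess, which I have not checked against a specific pandas version, is that the error comes from assigning the Series returned by `adf[nor_cols].max()` to the block of `_max` columns.

```python
import pandas as pd


def normalize_by_column_max(df):
    """Hypothetical helper: divide every column by its own maximum.

    Covers only the 'complete' case from the question.
    """
    # Work on a float copy so the input dataframe is left untouched.
    adf = df.astype(float)
    # adf.max() is a Series indexed by column label. Dividing a DataFrame
    # by such a Series aligns on the column labels, so each column is
    # divided by its own maximum and no helper columns are needed.
    return adf / adf.max()


# Same example data as in the question.
df2 = pd.DataFrame(data={'A': [20, 10, 30], 'B': [1, 2, 3]})
result = normalize_by_column_max(df2)
print(result)
```

This keeps the column names `A` and `B` unchanged, which avoids the mismatch between the original labels and the `_max` labels.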