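Creating variables dynamically is not a good idea, but you can easily take advantage of mutable objects such as dictionaries.
Add a new DataFrame method to do this seamlessly: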
import pandas as pd
from pandas.core.base import PandasObject

### this only needs to be done once per session
def to_name(df, dic, name, copy=False):
    # Store the frame (or a copy of it) under `name`, then return it so the chain continues.
    dic[name] = df.copy() if copy else df
    return df

PandasObject.to_name = to_name
###

tmp = {}
df = (pd.DataFrame([[2, 4, 6],
                    [8, 10, 12],
                    [14, 16, 18],
                    ])
      .assign(something_else=100)
      .div(2)
      .to_name(tmp, 'after_div2', copy=True)
      .div(10)
)

print(tmp['after_div2'])
print(df)
Output:
# tmp['after_div2']
     0    1    2  something_else
0  1.0  2.0  3.0            50.0
1  4.0  5.0  6.0            50.0
2  7.0  8.0  9.0            50.0

# df
     0    1    2  something_else
0  0.1  0.2  0.3             5.0
1  0.4  0.5  0.6             5.0
2  0.7  0.8  0.9             5.0
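The copy flag of to_name decides whether the dictionary keeps a reference to the frame passing through the chain or an independent snapshot. Here is a minimal sketch of the difference, calling the helper as a plain function; the names base, by_reference, and snapshot are illustrative and not part of the answer above:

```python
import pandas as pd

def to_name(df, dic, name, copy=False):
    # Same helper as above: store df under `name`, then hand it back.
    dic[name] = df.copy() if copy else df
    return df

tmp = {}
base = pd.DataFrame({"a": [1, 2, 3]})

# copy=False: the dictionary holds the very same object that was passed in.
to_name(base, tmp, "by_reference")
# copy=True: the dictionary holds an independent snapshot.
to_name(base, tmp, "snapshot", copy=True)

# A later in-place change (hypothetical, for illustration only) ...
base.loc[0, "a"] = 99

# ... is visible through the reference but not through the snapshot.
by_reference_value = int(tmp["by_reference"].loc[0, "a"])
snapshot_value = int(tmp["snapshot"].loc[0, "a"])
print(by_reference_value, snapshot_value)  # 99 1
```

Most chained methods such as div return a new object, so the reference is usually safe, but copy=True is the cautious choice if anything later might modify the stored frame in place.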
If you would rather not monkey-patch the DataFrame object, use pipe:
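import pandas as pd

def to_name(df, dic, name, copy=False):
    # Store the frame (or a copy of it) under `name`, then return it so the chain continues.
    dic[name] = df.copy() if copy else df
    return df

tmp = {}
df = (pd.DataFrame([[2, 4, 6],
                    [8, 10, 12],
                    [14, 16, 18],
                    ])
      .assign(something_else=100)
      .div(2)
      .pipe(to_name, tmp, 'after_div2')
      .div(10)
      # print() returns None, so `None or df` passes df on down the chain
      .pipe(lambda df: print('\nQuick alternative:', df, sep='\n') or df)
)

print(tmp['after_div2'])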
Printing
Along the same lines, you can also add a chainable print method, or again use a lambda in pipe:
import pandas as pd
from pandas.core.base import PandasObject

### this only needs to be done once per session
def df_print(df, *args):
    # Print an optional label, then the frame, and return the frame unchanged.
    if args:
        print(*args)
    print(df)
    return df

PandasObject.print = df_print
###

df = (pd.DataFrame([[2, 4, 6],
                    [8, 10, 12],
                    [14, 16, 18],
                    ])
      .print()
      .assign(something_else=100)
      .div(2)
      .print('\nAfter 2:')
      .div(10)
      .pipe(lambda df: print('\nQuick alternative:', df, sep='\n') or df)
)
Output:
    0   1   2
0   2   4   6
1   8  10  12
2  14  16  18

After 2:
     0    1    2  something_else
0  1.0  2.0  3.0            50.0
1  4.0  5.0  6.0            50.0
2  7.0  8.0  9.0            50.0

Quick alternative:
     0    1    2  something_else
0  0.1  0.2  0.3             5.0
1  0.4  0.5  0.6             5.0
2  0.7  0.8  0.9             5.0
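The lambda works because print returns None, so the expression `print(...) or df` evaluates to df. The idiom silently breaks if the side-effect function returns something truthy. A small wrapper avoids relying on that; this is only a sketch, and the name tap is hypothetical rather than anything pandas provides:

```python
import pandas as pd

def tap(func):
    """Hypothetical helper: make any side-effect function usable in .pipe()."""
    def wrapper(df, *args, **kwargs):
        func(df, *args, **kwargs)  # return value is deliberately ignored
        return df                  # always pass the frame on unchanged
    return wrapper

seen = []
df = (pd.DataFrame([[2, 4, 6],
                    [8, 10, 12],
                    ])
      .pipe(lambda d: print(d) or d)               # fine: print returns None
      .div(2)
      .pipe(tap(lambda d: seen.append(d.shape)))   # works whatever func returns
      .div(10)
)
```

After the chain runs, seen holds the shape recorded at the intermediate step and df holds the final result.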
As a module
You can also keep these helpers in a module of their own.
pandas_debug.py:
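from pandas.core.base import PandasObject

def df_print(df, *args):
    # Print an optional label, then the frame, and return the frame unchanged.
    if args:
        print(*args)
    print(df)
    return df

PandasObject.print = df_print

def to_name(df, dic, name, copy=False):
    # Store the frame (or a copy of it) under `name`, then return it so the chain continues.
    dic[name] = df.copy() if copy else df
    return df

PandasObject.to_name = to_name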
Then in your code:
import pandas as pd
import pandas_debug  # importing the module applies the patches

tmp = {}
df = (pd.DataFrame([[2, 4, 6],
                    [8, 10, 12],
                    [14, 16, 18],
                    ])
      .assign(something_else=100)
      .div(2)
      .to_name(tmp, 'after_div2')
      .div(10)
      .print()
)
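A possible variation, not part of the approach above: pandas offers a documented extension hook, pd.api.extensions.register_dataframe_accessor, which attaches methods under a namespace instead of patching PandasObject. The sketch below assumes an accessor name of debug and a class name of DebugAccessor, both chosen here purely for illustration:

```python
import pandas as pd

@pd.api.extensions.register_dataframe_accessor("debug")  # "debug" is an arbitrary name
class DebugAccessor:
    def __init__(self, pandas_obj):
        # pandas passes in the DataFrame the accessor is reached from.
        self._df = pandas_obj

    def to_name(self, dic, name, copy=False):
        # Same logic as to_name above, returning the frame to keep the chain going.
        dic[name] = self._df.copy() if copy else self._df
        return self._df

    def print(self, *args):
        # Same logic as df_print above.
        if args:
            print(*args)
        print(self._df)
        return self._df

tmp = {}
df = (pd.DataFrame([[2, 4, 6],
                    [8, 10, 12],
                    [14, 16, 18],
                    ])
      .assign(something_else=100)
      .div(2)
      .debug.to_name(tmp, 'after_div2')
      .div(10)
      .debug.print()
)
```

The trade-off is a slightly longer call (.debug.to_name rather than .to_name) in exchange for not touching pandas internals. This accessor covers DataFrame only; Series would need its own registration with register_series_accessor.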