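I am trying to create a new column by performing some operations on existing columns, but my code raises a KeyError. I tried debugging it by printing df.columns and copy-pasting the exact column name, but I still get the same error. My code is as follows: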

def calculate_elasticity(group):
    sales_change = group['Primary Sales Quantity'].pct_change()
    price_change = group['MRP'].pct_change()
    
    elasticity = sales_change / price_change
    
    return elasticity

df['Variant-based Elasticity'] = df.groupby('Variant').transform(calculate_elasticity)

The error shown is:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3801             try:
-> 3802                 return self._engine.get_loc(casted_key)
   3803             except KeyError as err:

16 frames
pandas/_libs/index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type()

KeyError: 'Primary Sales Quantity'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3802                 return self._engine.get_loc(casted_key)
   3803             except KeyError as err:
-> 3804                 raise KeyError(key) from err
   3805             except TypeError:
   3806                 # If we have a listlike key, _check_indexing_error will raise

KeyError: 'Primary Sales Quantity'

While debugging, I printed df.columns and got:

Index(['Cal. year / month', 'Material', 'Product Name', 'MRP', 'Distribution Channel (Master)', 'Unnamed: 5', 'L1 Prod Category', 'L2 Prod Brand', 'L3 Prod Sub-Category', 'State', 'Primary Actual GSV Value', 'Primary Sales Qty (CS)', 'Secondary GSV', 'Secondary sales Qty(CS)', 'Primary Volume(MT/KL)', 'Secondary Volume(MT/KL)', 'Variant', 'Weight', 'Offers', 'Primary Sales Quantity'], dtype='object')

And this is the output of print(df['Primary Sales Quantity']):

0          155
1        16953
2          455
3          138
4         2653
         ...  
14147        6
14148        1
14149     8428
14150      237
14151       24
Name: Primary Sales Quantity, Length: 14152, dtype: int64

So the column name is correct: I can access the column by that name outside the function, and the error is raised only inside it.

Recommended answer

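GroupBy.transform calls the function on one column at a time, so inside it group['Primary Sales Quantity'] is looked up in the index of a Series rather than among the column names, which raises the KeyError. To work with 2 columns together, use GroupBy.apply instead: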

def calculate_elasticity(group):
    sales_change = group['Primary Sales Quantity'].pct_change()
    price_change = group['MRP'].pct_change()
    
    group['Variant-based Elasticity'] = sales_change / price_change
    return group

df = df.groupby('Variant', group_keys=False).apply(calculate_elasticity)
print(df)
  Variant  Primary Sales Quantity  MRP  Variant-based Elasticity
0       a                      10    8                       NaN
1       a                       7   10                 -1.200000
2       b                      87    3                       NaN
3       b                       8    2                  2.724138
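The failure in the question can be reproduced on a small frame. The following is only a minimal sketch, assuming the same four-row sample data used in this answer; the `raised` flag is a hypothetical name added for illustration. It shows that the KeyError comes from the function being handed a single column, not from a misspelled column name.

```python
import pandas as pd

# Small sample frame (assumed for illustration; same values as the answer's sample data).
df = pd.DataFrame({'Variant': ['a', 'a', 'b', 'b'],
                   'Primary Sales Quantity': [10, 7, 87, 8],
                   'MRP': [8, 10, 3, 2]})

def calculate_elasticity(group):
    # Under transform, `group` can be a single column (a Series), so this
    # lookup searches the row labels instead of the column names.
    sales_change = group['Primary Sales Quantity'].pct_change()
    price_change = group['MRP'].pct_change()
    return sales_change / price_change

try:
    df.groupby('Variant').transform(calculate_elasticity)
    raised = False
except KeyError as err:
    raised = True
    print('KeyError:', err)
```

The column exists in df, yet the lookup still fails, which matches the behaviour described in the question.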

Or an alternative solution without a helper function:

g = df.groupby('Variant')
df['Variant-based Elasticity'] = (g['Primary Sales Quantity'].pct_change() /
                                  g['MRP'].pct_change())
print(df)
  Variant  Primary Sales Quantity  MRP  Variant-based Elasticity
0       a                      10    8                       NaN
1       a                       7   10                 -1.200000
2       b                      87    3                       NaN
3       b                       8    2                  2.724138
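One thing to watch for with this ratio: if the price of a variant does not change between two rows, the price change is 0 and the division yields an infinite value instead of raising an error. The sketch below is one possible way to handle that, assuming hypothetical data in which the MRP of variant 'a' stays constant; whether to treat such rows as missing is a modelling choice, not something the original answer prescribes.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the MRP of variant 'a' is the same in both rows.
df = pd.DataFrame({'Variant': ['a', 'a', 'b', 'b'],
                   'Primary Sales Quantity': [10, 7, 87, 8],
                   'MRP': [8, 8, 3, 2]})

g = df.groupby('Variant')
elasticity = g['Primary Sales Quantity'].pct_change() / g['MRP'].pct_change()

# A zero price change makes the ratio infinite; here it is replaced with NaN.
df['Variant-based Elasticity'] = elasticity.replace([np.inf, -np.inf], np.nan)
print(df)
```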

Alternative solution with a helper DataFrame df1:

df1 = df.groupby('Variant')[['Primary Sales Quantity', 'MRP']].pct_change()
df['Variant-based Elasticity'] = df1['Primary Sales Quantity'] / df1['MRP']
print(df)
  Variant  Primary Sales Quantity  MRP  Variant-based Elasticity
0       a                      10    8                       NaN
1       a                       7   10                 -1.200000
2       b                      87    3                       NaN
3       b                       8    2                  2.724138

Sample data:

import pandas as pd

df = pd.DataFrame({'Variant': ['a', 'a', 'b', 'b'], 
                   'Primary Sales Quantity': [10, 7, 87, 8], 
                   'MRP': [8, 10, 3, 2]})
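As a quick sanity check, the two vectorized variants can be compared on this sample data. This is only a sketch; the names `direct` and `via_helper` are hypothetical and introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Variant': ['a', 'a', 'b', 'b'],
                   'Primary Sales Quantity': [10, 7, 87, 8],
                   'MRP': [8, 10, 3, 2]})

# Variant 1: divide the per-group percentage changes of the two columns directly.
g = df.groupby('Variant')
direct = g['Primary Sales Quantity'].pct_change() / g['MRP'].pct_change()

# Variant 2: compute both percentage changes at once in a helper DataFrame.
df1 = df.groupby('Variant')[['Primary Sales Quantity', 'MRP']].pct_change()
via_helper = df1['Primary Sales Quantity'] / df1['MRP']

# Raises an AssertionError if the two results differ.
pd.testing.assert_series_equal(direct, via_helper, check_names=False)
print(direct)
```

The first row of each group is NaN because pct_change has no previous row to compare against within that group.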
