Unexpected output from the Python pandas DataFrame GroupBy.diff function

Posted on April 23

Consider the following Python code, which is essentially copied from the first code snippet in the "Group by: split-apply-combine" chapter of the pandas User Guide.

import pandas as pd
import numpy as np

# np.nan is used here because the np.NaN alias was removed in NumPy 2.0
speeds = pd.DataFrame(
    data={'class': ['bird', 'bird', 'mammal', 'mammal', 'mammal'],
          'order': ['Falconiformes', 'Psittaciformes', 'Carnivora', 'Primates', 'Carnivora'],
          'max_speed': [389.0, 24.0, 80.2, np.nan, 58.0]},
    index=['falcon', 'parrot', 'lion', 'monkey', 'leopard']
)

# Group the speeds by class, then take row-to-row differences within each group
grouped = speeds.groupby('class')['max_speed']
grouped.diff()

When executed in Google Colab, the output is:

falcon       NaN
parrot    -365.0
lion         NaN
monkey       NaN
leopard      NaN
Name: max_speed, dtype: float64

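This is the same as the output shown in the User Guide.

Why is the value for the parrot index element -365.0 rather than NaN, like the rest of the values in this Series?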

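Recommended answer

The output is correct and expected. For clarity, here is a line-by-line breakdown of what diff() does within each group: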

falcon       NaN                 # NaN since first of the "bird" group
parrot    -365.0                 # 24 - 389   = -365
lion         NaN                 # NaN since first of the "mammal" group
monkey       NaN                 # NaN - 80.2 = NaN
leopard      NaN                 # 58 - NaN   = NaN
Name: max_speed, dtype: float64
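
To make the breakdown above concrete, here is a minimal sketch that reproduces the result by hand. It assumes that a group-wise diff is the same as subtracting the previous row of the same group, which is what the breakdown describes; the names by_diff and by_shift are only illustrative, and the 'order' column is left out because it plays no part in the calculation.

```python
import numpy as np
import pandas as pd

speeds = pd.DataFrame(
    data={'class': ['bird', 'bird', 'mammal', 'mammal', 'mammal'],
          'max_speed': [389.0, 24.0, 80.2, np.nan, 58.0]},
    index=['falcon', 'parrot', 'lion', 'monkey', 'leopard'],
)

grouped = speeds.groupby('class')['max_speed']

# diff() subtracts the previous row *within the same group*.
by_diff = grouped.diff()

# The same thing by hand: shift each group down one row, then subtract.
# The first row of every group has no predecessor, so it becomes NaN,
# and any subtraction that involves NaN also yields NaN.
by_shift = speeds['max_speed'] - grouped.shift(1)

print(by_diff.equals(by_shift))
```

Only parrot has a non-NaN value of its own and a non-NaN predecessor in the same group, which is why it is the only row with a numeric result.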

If you replace the NaN in the input with a valid value (for example, 42), you get:

falcon       NaN                 # NaN since first of the "bird" group
parrot    -365.0                 # 24 - 389   = -365
lion         NaN                 # NaN since first of the "mammal" group
monkey     -38.2                 # 42 - 80.2  = -38.2
leopard     16.0                 # 58 - 42    = 16
Name: max_speed, dtype: float64
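
A minimal sketch of that substitution follows. Using fillna to swap in 42 is an assumption about how one might do it (the answer does not say how the value was replaced), and the names filled and result are illustrative.

```python
import numpy as np
import pandas as pd

speeds = pd.DataFrame(
    data={'class': ['bird', 'bird', 'mammal', 'mammal', 'mammal'],
          'max_speed': [389.0, 24.0, 80.2, np.nan, 58.0]},
    index=['falcon', 'parrot', 'lion', 'monkey', 'leopard'],
)

# Replace the missing monkey speed with 42 (an arbitrary valid value).
filled = speeds.fillna({'max_speed': 42.0})

# With no NaN in the input, only the first row of each group is NaN.
result = filled.groupby('class')['max_speed'].diff()
print(result)
```

The remaining NaN values for falcon and lion are inherent to diff(): the first row of each group has nothing to be subtracted from it.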
