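I have built a function that creates a data frame with one or two columns, depending on the input. The input data frame has a datetime column imported from Excel. During the import, I set that column to datetime with the following code: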


    df['Date Raised'] = pd.to_datetime(df['Date Raised'],
                                           dayfirst=True,
                                           format='"%Y-%m-%d"')

I then pass the data frame to this function:


    def data_award_by_grade(df, x=None):
        df = df.copy()
        df = df[["Amount Awarded", "Nominee Grade", "Date Raised"]]
    
        target_date = pd.Timestamp("2023-04-01")
        after_target_date = df[df['Date Raised'] > target_date] 
        
        if x is not None and x > 0:
            df_one = after_target_date.groupby(["Nominee Grade"]).mean().astype(int)
            df_one = df_one.rename(columns={'Amount Awarded': 'Total Average'})
            
            x_months_after_date = target_date + pd.DateOffset(days=x * 30)
            df_two = after_target_date[after_target_date['Date Raised'] <= x_months_after_date]
            df_two = df_two.groupby(["Nominee Grade"]).mean().astype(int)
            df_two = df_two.rename(columns={'Amount Awarded': f'Average Across {x} Month(s)'})
            result_df = df_one.add(df_two, fill_value=0).replace(np.nan, 0).astype(int)
            
        else:
            result_df = after_target_date.groupby(["Nominee Grade"]).mean().astype(int)
            result_df = result_df.rename(columns={'Amount Awarded': 'Total Average'})
    
        return result_df
    
    award_by_grade = data_award_by_grade(raw_data, 6)
    
    award_by_grade

Whenever I run it, it raises this error:


    TypeError: Converting from datetime64[ns] to int32 is not supported. Do obj.astype('int64').astype(dtype) instead

Full traceback:

    TypeError                                 Traceback (most recent call last)
    <ipython-input-30-462a3025057a> in <module>
         22     return result_df
         23 
    ---> 24 MQD_award_by_grade = data_award_by_grade(MQD_raw_data, 6)
         25 
         26 MQD_award_by_grade

    <ipython-input-30-462a3025057a> in data_award_by_grade(df, x)
          7 
          8     if x is not None and x > 0:
    ----> 9         df_one = after_target_date.groupby(["Nominee Grade"]).mean().astype(int)
         10         df_one = df_one.rename(columns={'Amount Awarded': 'Total Average'})
         11 

    ~\Anaconda3\lib\site-packages\pandas\core\generic.py in astype(self, dtype, copy, errors)
       6322         else:
       6323             # else, only a single dtype is given
    -> 6324             new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
       6325             return self._constructor(new_data).__finalize__(self, method="astype")
       6326 

    ~\Anaconda3\lib\site-packages\pandas\core\internals\managers.py in astype(self, dtype, copy, errors)
        449             copy = False
        450 
    --> 451         return self.apply(
        452             "astype",
        453             dtype=dtype,

    ~\Anaconda3\lib\site-packages\pandas\core\internals\managers.py in apply(self, f, align_keys, **kwargs)
        350                 applied = b.apply(f, **kwargs)
        351             else:
    --> 352                 applied = getattr(b, f)(**kwargs)
        353             result_blocks = extend_blocks(applied, result_blocks)
        354 

    ~\Anaconda3\lib\site-packages\pandas\core\internals\blocks.py in astype(self, dtype, copy, errors, using_cow)
        509         values = self.values
        510 
    --> 511         new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)
        512 
        513         new_values = maybe_coerce_values(new_values)

    ~\Anaconda3\lib\site-packages\pandas\core\dtypes\astype.py in astype_array_safe(values, dtype, copy, errors)
        240 
        241     try:
    --> 242         new_values = astype_array(values, dtype, copy=copy)
        243     except (ValueError, TypeError):
        244         # e.g. _astype_nansafe can fail on object-dtype of strings

    ~\Anaconda3\lib\site-packages\pandas\core\dtypes\astype.py in astype_array(values, dtype, copy)
        182     if not isinstance(values, np.ndarray):
        183         # i.e. ExtensionArray
    --> 184         values = values.astype(dtype, copy=copy)
        185 
        186     else:

    ~\Anaconda3\lib\site-packages\pandas\core\arrays\datetimes.py in astype(self, dtype, copy)
        699         elif is_period_dtype(dtype):
        700             return self.to_period(freq=dtype.freq)
    --> 701         return dtl.DatetimeLikeArrayMixin.astype(self, dtype, copy)
        702 
        703     # -----------------------------------------------------------------

    ~\Anaconda3\lib\site-packages\pandas\core\arrays\datetimelike.py in astype(self, dtype, copy)
        470             values = self.asi8
        471             if dtype != np.int64:
    --> 472                 raise TypeError(
        473                     f"Converting from {self.dtype} to {dtype} is not supported. "
        474                     "Do obj.astype('int64').astype(dtype) instead"

Recommended answer

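As @FObersteiner suggested to you, use .astype('int64') to convert the datetime column to a number. The exception is raised because groupby(...).mean() also averages the Date Raised column, and on your platform int resolves to a 32-bit integer (the default on Windows with NumPy versions before 2.0, and on 32-bit builds). pandas refuses to cast a datetime64[ns] column directly to int32.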
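
To see where the datetime column comes from, here is a minimal sketch on hypothetical sample data (the values below are made up, only the column names come from the question). It assumes pandas 2.x, where the grouped mean keeps datetime columns:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data that mimics the question's columns.
df = pd.DataFrame({
    "Nominee Grade": ["A", "A", "B"],
    "Amount Awarded": [100, 150, 200],
    "Date Raised": pd.to_datetime(["2023-04-10", "2023-05-20", "2023-06-15"]),
})

# The grouped mean is computed for every non-key column, dates included.
means = df.groupby("Nominee Grade").mean()
print(means.dtypes)   # "Date Raised" is still a datetime64 column here
print(np.dtype(int))  # int32 or int64, depending on platform and NumPy version
```

Calling .astype(int) on `means` therefore tries to cast the averaged dates as well, which is the step that fails when int is 32-bit.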

However, what would it mean to convert a datetime to an int (or a float), given that you go on to apply pd.DateOffset to dates afterwards?
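
For illustration, a short sketch of what the integer form of a datetime64[ns] value looks like. The date is the target date from the question; the variable names are only for this example:

```python
import pandas as pd

# A one-element datetime64[ns] Series holding the question's target date.
dates = pd.Series([pd.Timestamp("2023-04-01")], dtype="datetime64[ns]")

# Casting to int64 yields nanoseconds since the Unix epoch (1970-01-01).
as_int = dates.astype("int64")
print(as_int.iloc[0])  # 1680307200000000000

# The integers can be turned back into timestamps, but they are not
# meaningful as an "average" column next to Amount Awarded.
round_trip = pd.to_datetime(as_int)
```

The number is a valid encoding of the date, yet it is rarely what a report of averages should display.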

You can try one of the following options:

    # 1. Convert as int64
    >>> df.groupby('Nominee Grade').mean().astype('int64')

    # 2. Round the result, then convert as int64
    >>> df.groupby('Nominee Grade').mean().round().astype('int64')

    # 3. Cast only Amount Awarded values to int (truncating) and leave Date Raised untouched
    >>> df.groupby('Nominee Grade').mean().astype({'Amount Awarded': int})
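
Another way to sidestep the cast is to select Amount Awarded before aggregating, so the date column is never averaged at all. The following is only a sketch of how the question's function might be restructured under that assumption, not the asker's original logic: the target_date parameter, the use of round() instead of truncation, the pd.concat step, and the sample data are all choices made for this example.

```python
import pandas as pd

def data_award_by_grade(df, x=None, target_date="2023-04-01"):
    """Average 'Amount Awarded' per grade after target_date (illustrative sketch)."""
    target_date = pd.Timestamp(target_date)
    after = df[df["Date Raised"] > target_date]

    # Pick the numeric column before aggregating so dates are never averaged.
    total = (after.groupby("Nominee Grade")["Amount Awarded"]
                  .mean().round().astype("int64").rename("Total Average"))
    if x is None or x <= 0:
        return total.to_frame()

    # Same 30-day approximation of a month as in the question.
    cutoff = target_date + pd.DateOffset(days=x * 30)
    window = after[after["Date Raised"] <= cutoff]
    recent = (window.groupby("Nominee Grade")["Amount Awarded"]
                    .mean().round().astype("int64")
                    .rename(f"Average Across {x} Month(s)"))

    # Grades with no rows inside the window are filled with 0.
    return pd.concat([total, recent], axis=1).fillna(0).astype("int64")

# Hypothetical sample data; only the column names come from the question.
sample = pd.DataFrame({
    "Nominee Grade": ["A", "A", "A", "A", "B", "C"],
    "Amount Awarded": [999, 100, 150, 300, 200, 400],
    "Date Raised": pd.to_datetime([
        "2023-03-01", "2023-04-10", "2023-05-20",
        "2024-01-05", "2023-06-15", "2024-02-01",
    ]),
})

result = data_award_by_grade(sample, 6)
print(result)
```

Because both results are indexed by Nominee Grade, pd.concat places the two averages side by side, and every cast is applied to numeric data only.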
