Running pandas 1.5.3. Also attempted on pandas 2.2.1.
I'm loading data from a CSV that looks like this:
888|0|TEST ACCOUNT
888|1|Sample Ship-to
802001|0|COMPANY 1
802001|1|COMPANY 1 INC
802001|2|COMPANY 1 BALL
K802001|3|COMPANY 1
The columns are CUSNO, S2, and NAME, in that order.
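I have a script that loads the data, then checks the first column and makes sure it is int64 in the resulting DataFrame. If it isn't, the script should convert the column to numeric and drop the rows that contain NaN in it.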
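For anyone who wants to reproduce this, here is a minimal sketch of the loading step. It assumes the file is pipe-delimited with no header row and that the column names are supplied by hand; the in-memory `raw` string and the `needs_conversion` flag are stand-ins for illustration, not part of my actual script.

```python
import io

import pandas as pd

# Sample rows copied from the data above; the real script reads a file instead.
raw = """888|0|TEST ACCOUNT
888|1|Sample Ship-to
802001|0|COMPANY 1
802001|1|COMPANY 1 INC
802001|2|COMPANY 1 BALL
K802001|3|COMPANY 1
"""

# Assumption: pipe-delimited, no header row, names given explicitly.
cl = pd.read_csv(
    io.StringIO(raw),
    sep="|",
    header=None,
    names=["CUSNO", "S2", "NAME"],
)

# The "K802001" entry keeps CUSNO from being parsed as an integer column,
# so this check is what triggers the conversion step.
needs_conversion = not pd.api.types.is_integer_dtype(cl["CUSNO"])
```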
So, before:
CUSNO S2 NAME
0 888 0 TEST ACCOUNT
1 888 1 Sample Ship-to
2 802001 0 COMPANY 1
3 802001 1 COMPANY 1 INC
4 802001 2 COMPANY 1 BALL
5 K802001 3 COMPANY 1
Then I run:
cl['CUSNO'] = pd.to_numeric(cl.CUSNO, errors='coerce')
cl = cl.dropna(axis='index', how='any')
After:
CUSNO S2 NAME
0 888.0 0 TEST ACCOUNT
1 888.0 1 Sample Ship-to
2 802001.0 0 COMPANY 1
3 802001.0 1 COMPANY 1 INC
4 802001.0 2 COMPANY 1 BALL
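I want CUSNO to be a column of int64 or a similar type, but when I run company_locations['CUSNO'].dtype, it keeps returning float64. (Really, I want to get rid of the trailing decimal point on every entry in CUSNO, and I figured casting to int or similar would be the best way to do that.)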
I've tried a few solutions, namely:
cl['CUSNO'] = pd.to_numeric(cl.CUSNO, errors='coerce').dropna().astype(int) # replacing the earlier line 1 of the script
cl['CUSNO'] = cl.astype({'CUSNO': 'int'})
cl['CUSNO'] = cl['CUSNO'].apply(pd.to_numeric, errors='coerce')
For the second line of the script above, I've also tried inplace=True. I've also tried the solutions from "pandas: to_numeric for multiple columns", "Change column type in pandas", and "Python - pandas column type casting with "astype" is not working".
Maybe I'm missing something? Do I have to copy the new DataFrame to a new variable or something?
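For reference, here is a sketch of the order of operations that, as far as I can tell, should give an integer column: coerce first, drop the NaN rows from the whole DataFrame, and only then cast. It is an assumption on my part that the cast has to come after the rows are dropped. Assigning a shortened Series back to cl['CUSNO'] aligns on the index, so the dropped row comes back as NaN and the column goes back to float64. The inline DataFrame and the use of subset= in dropna are illustrative choices, not what my script currently does.

```python
import pandas as pd

# Stand-in for the loaded CSV, with CUSNO still held as strings.
cl = pd.DataFrame(
    {
        "CUSNO": ["888", "888", "802001", "802001", "802001", "K802001"],
        "S2": [0, 1, 0, 1, 2, 3],
        "NAME": [
            "TEST ACCOUNT",
            "Sample Ship-to",
            "COMPANY 1",
            "COMPANY 1 INC",
            "COMPANY 1 BALL",
            "COMPANY 1",
        ],
    }
)

# Step 1: coerce to numeric. "K802001" becomes NaN, so the column is float64 here.
cl["CUSNO"] = pd.to_numeric(cl["CUSNO"], errors="coerce")

# Step 2: drop rows whose CUSNO is NaN and keep the returned DataFrame.
cl = cl.dropna(subset=["CUSNO"])

# Step 3: cast once no NaN is left; astype returns a new DataFrame,
# so the result is assigned back to cl rather than to a single column.
cl = cl.astype({"CUSNO": "int64"})
```

If the rows with bad values needed to be kept instead of dropped, the nullable "Int64" dtype (capital I) would be the alternative to look at, since a plain int64 column cannot hold NaN.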