A sample of my data:
StreetAddress | City | State | Zip |
---|---|---|---|
1 Main St 01123 | Winsted | CT | |
1 Main St | Winsted | CT | 01123 |
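I'm trying to use regex and pandas to clean up a spreadsheet I have. The problem is that my regex code overwrites every cell in the Zip column, even the cells that already contain valid data.
I tried: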
df['Zip'] = df['StreetAddress'].str.extract(r'(\d{5})')
df['StreetAddress'] = df['StreetAddress'].str.replace(r'(\d{5})', '', regex=True)
This gives me:
StreetAddress | City | State | Zip |
---|---|---|---|
1 Main St | Winsted | CT | 01123 |
1 Main St | Winsted | CT | |
I'd like to end up with something more like this:
StreetAddress | City | State | Zip |
---|---|---|---|
1 Main St | Winsted | CT | 01123 |
1 Main St | Winsted | CT | 01123 |
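
For reference, one possible way to get that result is sketched below. `str.extract` returns NaN for rows where the pattern does not match, so assigning its result directly to `df['Zip']` wipes out the existing values. The sketch instead fills only the missing Zip cells from the extracted values. It rests on a few assumptions that are not in the original code: the empty Zip cells are read in as missing values (NaN/None) rather than empty strings, the ZIP code sits at the very end of `StreetAddress`, and the pattern is anchored to the end of the string so that a 5-digit house number is not picked up by mistake.

```python
import pandas as pd

# Sample frame rebuilt from the tables above; empty Zip cells are assumed to be missing (None/NaN).
df = pd.DataFrame({
    "StreetAddress": ["1 Main St 01123", "1 Main St"],
    "City": ["Winsted", "Winsted"],
    "State": ["CT", "CT"],
    "Zip": [None, "01123"],
})

# Pull a trailing 5-digit ZIP out of the address; rows without one yield NaN.
# Anchoring with $ is an assumption, the original pattern matched anywhere in the string.
extracted = df["StreetAddress"].str.extract(r"\b(\d{5})\s*$", expand=False)

# Keep existing Zip values and fill only the gaps from the extracted ones.
df["Zip"] = df["Zip"].fillna(extracted)

# Remove the trailing ZIP (and the whitespace before it) from the address.
df["StreetAddress"] = df["StreetAddress"].str.replace(
    r"\s*\b\d{5}\s*$", "", regex=True
)

print(df)
```

If the empty cells are actually empty strings, they would need to be converted to missing values first (for example with `df["Zip"].replace("", pd.NA)`) before `fillna` has any effect.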