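I want to read a column in which the first word of each row is the survey's quarter and year, followed by the survey name. At first I tried renaming the survey names while hard-coding the quarter and year, which stay the same throughout the column. But if I run this script on a file from a different quarter, the full strings no longer match and my script does not work.
My example: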
Survey Name
0 Q321 Your Voice - Information Tech
1 Q321 Your Voice - Information Tech
2 Q321 Your Voice - Information Tech
3 Q321 Your Voice - Information Tech
4 Q321 Your Voice - Information Tech
9630 Q321 Your Voice - Business Group
9631 Q321 Your Voice - Business Group
(Q321 = Q3 2021)
My code converts it to:
Survey Name
0 Q321 YV - IT
1 Q321 YV - IT
2 Q321 YV - IT
3 Q321 YV - IT
4 Q321 YV - IT
9630 Q321 YV - BG
9631 Q321 YV - BG
The code I used:
print(df.loc[:, "Survey.Name"])
# Isolate the column of interest and replace the commonly incorrect strings with the correct output
df.loc[df['Survey.Name'].str.contains('Q321 Your Voice - Information Tech'), 'Survey.Name'] = \
'Q321 YV - IT'
df.loc[df['Survey.Name'].str.contains('Q321 Your Voice - Business Group'), 'Survey.Name'] = \
'Q321 YV - BG'
df.loc[df['Survey.Name'].str.contains('Q321 Your Voice - Study Group'), 'Survey.Name'] = \
'Q321 YV - SG'
print(df.loc[:, "Survey.Name"])
But suppose I run this script on a file from another quarter (say Q4 2021):
Survey Name
0 Q421 Your Voice - Information Tech
1 Q421 Your Voice - Information Tech
2 Q421 Your Voice - Information Tech
3 Q421 Your Voice - Information Tech
4 Q421 Your Voice - Information Tech
9630 Q421 Your Voice - Business Group
9631 Q421 Your Voice - Business Group
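I have to change the script every time a new quarter comes around. Is there a way to "detect" the first word (which, fortunately, is always the survey's quarter and year) and include it in the converted version, while still replacing the strings in that column that need to change?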
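
One possible approach, sketched here under assumptions rather than offered as a definitive answer, is to capture the first word with a regular expression group and reuse it in the replacement via a backreference. The sample DataFrame and the `abbreviations` dictionary below are hypothetical and built from the names shown above. The pattern assumes the quarter code is the first whitespace-delimited word in each cell.

```python
import re

import pandas as pd

# Hypothetical sample data mirroring the 'Survey.Name' column from the question.
df = pd.DataFrame({
    "Survey.Name": [
        "Q321 Your Voice - Information Tech",
        "Q421 Your Voice - Business Group",
        "Q421 Your Voice - Study Group",
    ]
})

# Long survey names and their abbreviations, taken from the question.
abbreviations = {
    "Your Voice - Information Tech": "YV - IT",
    "Your Voice - Business Group": "YV - BG",
    "Your Voice - Study Group": "YV - SG",
}

for long_name, short_name in abbreviations.items():
    # (\S+) captures the first word (e.g. Q321 or Q421), whatever the quarter is.
    # The trailing .* consumes the rest of the cell, so the whole value is replaced.
    pattern = r"^(\S+)\s+" + re.escape(long_name) + r".*$"
    # \1 re-inserts the captured quarter in front of the abbreviation.
    df["Survey.Name"] = df["Survey.Name"].str.replace(
        pattern, r"\1 " + short_name, regex=True
    )

print(df["Survey.Name"])
```

Because the quarter is captured instead of hard-coded, the same script should work unchanged on files from other quarters. Rows that match none of the patterns are left as they are.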