I want to select only rows that have fc_id == 2, and then delete those having duplicates. This is my input file
I'm stuck on just the first step. After that, I also need an output file containing the final data: rows with fc_id == 2 and no duplicates.
I tried this:
df = pd.read_csv(r'test.csv')
df2 = df[df["fc_id"]==2]
def condi(df2):
    df3[x] = np.where(df(df2)==2, 1, 0)
    return x
var = condi(df2)
#print(var)
with open('test.csv', 'r') as in_file, open('out_test.csv', 'w') as out_file:
    seen = set()
    if var == 1:
        for line in in_file:
            if line in seen: continue
            seen.add(line)
            out_file.write(line)
I get an error when I try to print(var); it says "'DataFrame' object is not callable".
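
For reference, the message points at df(df2) inside condi: parentheses call the DataFrame like a function, whereas selection uses square brackets, so the error is raised as soon as condi(df2) runs rather than at the print. Below is a minimal sketch of the stated goal done entirely in pandas, assuming "duplicates" means rows that are identical in every column. The helper name select_unique_rows, the "value" column, and the sample data are hypothetical stand-ins for the real test.csv; only fc_id comes from the question.

```python
import pandas as pd


def select_unique_rows(df: pd.DataFrame, fc_id: int = 2) -> pd.DataFrame:
    """Keep rows whose fc_id matches, then drop exact duplicate rows."""
    # Boolean mask: index with square brackets, df[...], not df(...)
    filtered = df[df["fc_id"] == fc_id]
    # drop_duplicates() compares all columns and keeps the first occurrence
    return filtered.drop_duplicates()


# Hypothetical sample data standing in for test.csv
sample = pd.DataFrame(
    {
        "fc_id": [2, 2, 1, 2, 3],
        "value": ["a", "a", "b", "c", "a"],
    }
)

result = select_unique_rows(sample)
print(result)

# With the real file, the same steps would be:
# df = pd.read_csv("test.csv")
# select_unique_rows(df).to_csv("out_test.csv", index=False)
```

If every copy of a duplicated row should be removed instead of keeping one, drop_duplicates(keep=False) does that.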