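I've pulled data for a bunch of Survivor seasons and collected the contestant tables into a DataFrame. I added a column to identify which season each contestant competed in. I want to set the index to Season and eventually sort by season, but when I set the index to Season, every index value shows up as NaN. I'm not sure what I'm doing wrong. Which documentation should I be looking at? The set_index docs weren't very approachable for me; I'm a beginner and may be out of my depth.
Current code: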
import pandas as pd
import numpy as py
contestants = {}
for season in range(11, 47):
    wiki_url = f'https://en.wikipedia.org/wiki/Survivor_{season}'
    tables = pd.read_html(wiki_url)
    contestants[season] = tables[1]
    contestants[season]['Season'] = season
contestants_joined = pd.concat(contestants.values()).set_index('Season', drop=False, inplace=False)
print(contestants_joined)
And this is what I get:
(Contestant, Contestant) (Age, Age) (From, From) \
Season
NaN Jim Lynch 63 Northglenn, Colorado
NaN Morgan McDevitt 21 Decatur, Illinois
NaN Brianna Varela 21 Edmonds, Washington
NaN Brooke Struck 25 Hood River, Oregon
NaN Blake Towsley 24 Dallas, Texas
... ... ... ...
NaN Soda Thompson 27 Lake Hopatcong, New Jersey
NaN Tevin Davis 24 Richmond, Virginia
NaN Tiffany Nicole Ervin 33 Elizabeth, New Jersey
NaN Tim Spicer 31 Atlanta, Georgia
NaN Venus Vafa 24 Toronto, Ontario
(Tribe, Original) (Tribe, Switched) (Tribe, Merged) \
Season
NaN Nakúm NaN NaN
NaN Yaxhá NaN NaN
NaN Yaxhá NaN NaN
NaN Nakúm Nakúm NaN
NaN Nakúm Yaxhá NaN
... ... ... ...
NaN Nami NaN NaN
NaN Nami NaN NaN
NaN Yanu NaN NaN
...
NaN NaN NaN
NaN NaN NaN
So it does set the index to the 'Season' column, but all of the index values are NaN.
Any help would be appreciated.
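One possible direction, offered as a sketch rather than a confirmed fix: the tuple-style labels in the output, such as (Contestant, Contestant), suggest the scraped tables have two-level (MultiIndex) column headers, which may be why a plain 'Season' column does not behave as expected with set_index. The example below avoids the extra column entirely by passing the dictionary itself to pd.concat, so that its keys become an index level named Season. The two small tables and their second-row values are made up for illustration and stand in for the real Wikipedia tables.

```python
import pandas as pd

# Hypothetical stand-ins for the scraped tables, with two-level column headers.
cols = pd.MultiIndex.from_tuples(
    [("Contestant", "Contestant"), ("Tribe", "Original")]
)
contestants = {
    12: pd.DataFrame([["Example Player", "Example Tribe"]], columns=cols),
    11: pd.DataFrame(
        [["Jim Lynch", "Nakúm"], ["Morgan McDevitt", "Yaxhá"]], columns=cols
    ),
}

# Passing the dict (not .values()) makes its keys the outer index level.
contestants_joined = pd.concat(contestants, names=["Season"])

# Drop the inner per-table row number, then sort by season.
contestants_joined = contestants_joined.droplevel(1).sort_index()

print(contestants_joined)
```

With this approach the season number lives only in the index, so no 'Season' column has to be added inside the loop.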