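I am trying to pull data from three specific tables on the Survivor wiki pages: the contestants table, the season summary table, and the voting history table. I can get it to work fine on the contestants table, but it tells me it cannot find a table for the season summary or the voting history. My end goal is to combine them into a single dataframe for cleaning and manipulation.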

My code, which works for the contestants table but not for the other two, is below:

import pandas as pd

list_of_seasons = ['41', '42', '43', '44', '45', '46']
season_start = 41
contestants = {}
season_summary = {}
voting_history = {}

for i in list_of_seasons :
    contestants[i] = pd.read_html('https://en.wikipedia.org/wiki/Survivor_' + str(season_start), match='contestants')
    season_summary[i] = pd.read_html('https://en.wikipedia.org/wiki/Survivor_' + str(season_start), match='season summary')
    voting_history[i] = pd.read_html('https://en.wikipedia.org/wiki/Survivor_' + str(season_start), match='voting history')
    season_start = season_start + 1

print(contestants['45'])
print(season_summary['45'])
print(voting_history['45'])

The error I get is:

Traceback (most recent call last):
  File "c:\Users\bsjes\Documents\Code\Personal Projects\Survivor Data Grabber\SurvivorWikiRipper_0.2.py", line 13, in <module>
    season_summary[i] = pd.read_html('https://en.wikipedia.org/wiki/Survivor_' + str(season_start), match='season summary')        
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^        
  File "C:\Users\bsjes\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\io\html.py", line 1246, in read_html       
    return _parse(
           ^^^^^^^
  File "C:\Users\bsjes\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\io\html.py", line 1009, in _parse
    raise retained
  File "C:\Users\bsjes\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\io\html.py", line 989, in _parse
    tables = p.parse_tables()
             ^^^^^^^^^^^^^^^^
  File "C:\Users\bsjes\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\io\html.py", line 249, in parse_tables     
    tables = self._parse_tables(self._build_doc(), self.match, self.attrs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\bsjes\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\io\html.py", line 622, in _parse_tables    
    raise ValueError(f"No tables found matching pattern {repr(match.pattern)}")
ValueError: No tables found matching pattern 'season summary'

What should I be doing differently? Do I need to learn a different package?

Recommended answer

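Looking at the wiki pages, the tables sit at the same index on each page (e.g. the "Contestants" table is the second <table>, the season summary is the third, and so on). Note that match is tested against the text inside each <table> element, so a section heading such as "Season summary" that sits outside the table is likely why nothing was found.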
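To illustrate how `match` behaves, here is a minimal sketch that runs `pd.read_html` on a small inline HTML string instead of the live wiki page. The HTML and the helper `find_tables` are made up for this illustration; they only assume that section headings live outside the `<table>` elements, as they usually do on a wiki page.

```python
from io import StringIO

import pandas as pd

# Hypothetical miniature page: headings are siblings of the tables, not part of them.
HTML = """
<h2>Contestants</h2>
<table>
  <tr><th>Contestant</th><th>Age</th></tr>
  <tr><td>Hannah Rose</td><td>33</td></tr>
</table>
<h2>Season summary</h2>
<table>
  <tr><th>Title</th><th>Eliminated</th></tr>
  <tr><td>We Can Do Hard Things</td><td>Hannah</td></tr>
</table>
"""


def find_tables(pattern):
    """Return the tables whose own text matches `pattern`, or [] if none do."""
    try:
        return pd.read_html(StringIO(HTML), match=pattern)
    except ValueError:
        # read_html raises ValueError when no table contains matching text.
        return []


# The heading text is outside both tables, so this search comes back empty.
print(len(find_tables("Season summary")))
# A column header is inside the second table, so this search finds it.
print(len(find_tables("Eliminated")))
```

If you prefer `match` over positional indices, pick a string that appears inside the table you want, such as a distinctive column header.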

You can try:

import pandas as pd

contestants = {}
season_summary = {}
voting_history = {}

for season_start in range(41, 47):
    u = f"https://en.wikipedia.org/wiki/Survivor_{season_start}"

    # Parse every <table> on the page once, then pick tables by position.
    tables = pd.read_html(u)
    contestants[season_start] = tables[1]
    season_summary[season_start] = tables[2]
    voting_history[season_start] = tables[4]

print(contestants[45])
print(season_summary[45])
print(voting_history[45])

Prints:

                       Contestant Age                          From    Tribe                                                      Finish        
                       Contestant Age                          From Original Switched     None    Merged                       Placement     Day
0                     Hannah Rose  33           Baltimore, Maryland     Lulu      NaN      NaN       NaN                   1st voted out   Day 3
1                  Brandon Donlon  26      Sicklerville, New Jersey     Lulu      NaN      NaN       NaN                   2nd voted out   Day 5
2               Sabiyah Broderick  28  Jacksonville, North Carolina     Lulu      NaN      NaN       NaN                   3rd voted out   Day 7
3                    Sean Edwards  35                   Provo, Utah     Lulu     Reba      NaN       NaN                   4th voted out   Day 9
4          Brandon "Brando" Meyer  23           Seattle, Washington     Belo     Belo      NaN       NaN                   5th voted out  Day 11
5   Janani "J. Maya" Krishnan-Jha  24       Los Angeles, California     Reba     Reba  None[a]       NaN                   6th voted out  Day 13
6           Nicholas "Sifu" Alsup  30            O'Fallon, Illinois     Reba     Reba  None[a]  Dakuwaqa                   7th voted out  Day 14
7                 Kaleb Gebrewold  29   Vancouver, British Columbia     Lulu     Lulu  None[a]  Dakuwaqa   8th voted out 1st jury member  Day 14
8               Kellie Nalbandian  29       New York City, New York     Belo     Lulu  None[a]  Dakuwaqa   9th voted out 2nd jury member  Day 16
9                Kendra McQuarrie  30   Steamboat Springs, Colorado     Belo     Belo  None[a]  Dakuwaqa  10th voted out 3rd jury member  Day 17
10    Bruce Perreault Survivor 44  47         Warwick, Rhode Island     Belo     Lulu  None[a]  Dakuwaqa  11th voted out 4th jury member  Day 19
11                  Emily Flippen  28              Laurel, Maryland     Lulu     Belo  None[a]  Dakuwaqa  12th voted out 5th jury member  Day 21
12                    Drew Basile  23    Philadelphia, Pennsylvania     Reba     Belo  None[a]  Dakuwaqa  13th voted out 6th jury member  Day 23
13                    Julie Alley  49          Brentwood, Tennessee     Reba     Reba  None[a]  Dakuwaqa  14th voted out 7th jury member  Day 24
14                  Katurah Topps  35            Brooklyn, New York     Belo     Lulu  None[a]  Dakuwaqa      Eliminated 8th jury member  Day 25
15                    Jake O'Kane  26         Boston, Massachusetts     Belo     Lulu  None[a]  Dakuwaqa                   2nd runner-up  Day 26
16                 Austin Li Coon  26             Chicago, Illinois     Reba     Belo  None[a]  Dakuwaqa                       Runner-up  Day 26
17                 Dee Valladares  26                Miami, Florida     Reba     Reba  None[a]  Dakuwaqa                   Sole Survivor  Day 26
   Episode                                                                                                        Challenge winner(s)                                                                    Eliminated         
       No.                               Title            Air date                                                             Reward                                                           Immunity      Tribe   Player
0        1             "We Can Do Hard Things"  September 27, 2023                                                               Reba                                                               Belo       Lulu   Hannah
1        1             "We Can Do Hard Things"  September 27, 2023                                                               Reba                                                               Reba       Lulu   Hannah
2        2  "Brought a Bazooka to a Tea Party"     October 4, 2023                                                               Reba                                                               Reba       Lulu  Brandon
3        2  "Brought a Bazooka to a Tea Party"     October 4, 2023                                                               Belo                                                               Belo       Lulu  Brandon
4        3                "No Man Left Behind"    October 11, 2023                                                               Lulu                                                               Reba       Lulu  Sabiyah
5        3                "No Man Left Behind"    October 11, 2023                                                               Reba                                                               Belo       Lulu  Sabiyah
6        4                  "Music to My Ears"    October 18, 2023                                                                NaN                                                               Lulu       Reba     Sean
7        4                  "Music to My Ears"    October 18, 2023                                                                NaN                                                               Belo       Reba     Sean
8        5       "I Don't Want to Be the Worm"    October 25, 2023                                                               Reba                                                               Reba       Belo   Brando
9        5       "I Don't Want to Be the Worm"    October 25, 2023                                                               Lulu                                                               Lulu       Belo   Brando
10       6  "I'm Not Batman, I'm the Canadian"    November 1, 2023  Austin, Bruce, Drew, Julie, Kendra, Sifu [Katurah] (Blue Team)[a]  Austin, Bruce, Drew, Julie, Kendra, Sifu [Katurah] (Blue Team)[a]        NaN  J. Maya
11       7             "The Thorn in My Thumb"    November 8, 2023            Dee [Austin, Jake, Julie, Kaleb, Katurah] (Red Team)[b]                                                 Kellie (Blue Team)   Dakuwaqa     Sifu
12       7             "The Thorn in My Thumb"    November 8, 2023            Dee [Austin, Jake, Julie, Kaleb, Katurah] (Red Team)[b]                                                     Dee (Red Team)   Dakuwaqa    Kaleb
13       8   "Following a Dead Horse to Water"   November 15, 2023                                                   Survivor Auction                                                              Bruce   Dakuwaqa   Kellie
14       9                 "Sword of Damocles"   November 22, 2023                                 Bruce, Julie, Kendra (Yellow Team)                                                              Bruce   Dakuwaqa   Kendra
15      10             "How Am I the Mobster?"   November 29, 2023                                        Emily [Dee, Julie, Katurah]                                                             Austin   Dakuwaqa    Bruce
16      11     "This Game Rips Your Heart Out"    December 6, 2023                                             Drew [Austin, Jake][c]                                                               Drew   Dakuwaqa    Emily
17      12  "The Ex-Girlfriend at the Wedding"   December 13, 2023                                              Austin [Dee, Katurah]                                                                Dee   Dakuwaqa     Drew
18      13         "Living the Survivor Dream"   December 20, 2023                                                   Austin [Jake][d]                                                             Austin   Dakuwaqa    Julie
19      13         "Living the Survivor Dream"   December 20, 2023                                                                NaN                                                       Dee [Austin]   Dakuwaqa  Katurah
   Unnamed: 0_level_0 Original tribes                   Switched tribes            No tribes          Merged tribe                                                                                     Unnamed: 17_level_0
              Episode               1        2        3               4          5         6      6.1            7       7.1         8          9        10        11        12          13       13.1 Unnamed: 17_level_1
0                 Day               3        5        7               9         11        13       13        14[a]     14[a]        16         17        19        21        23          24         25                 NaN
1               Tribe            Lulu     Lulu     Lulu            Reba       Belo       NaN      NaN     Dakuwaqa  Dakuwaqa  Dakuwaqa   Dakuwaqa  Dakuwaqa  Dakuwaqa  Dakuwaqa    Dakuwaqa   Dakuwaqa                 NaN
2          Eliminated          Hannah  Brandon  Sabiyah            Sean     Brando       NaN  J. Maya         Sifu     Kaleb    Kellie     Kendra     Bruce     Emily      Drew       Julie    Katurah                 NaN
3               Votes          5–0[b]      3–0      2–1           3–1–1        3–2    0–0[c]     10–1          5–1       4–2       5–3        6–1     4–3–1    1–0[d]       4–2  2–1–1–0[e]    None[f]                 NaN
4               Voter            Vote     Vote     Vote            Vote       Vote      Vote     Vote         Vote      Vote      Vote       Vote      Vote      Vote      Vote        Vote  Challenge                 NaN
5                 Dee             NaN      NaN      NaN            Sifu        NaN     Kaleb  J. Maya          NaN     Kaleb    Kellie     Kendra      Jake     Julie      Drew     Katurah  Immune[f]                 NaN
6              Austin             NaN      NaN      NaN             NaN  Brando[g]   None[h]  None[h]          NaN     Kaleb    Kellie  Kendra[i]      Jake     Julie     Julie       Julie   Saved[f]                 NaN
7                Jake             NaN      NaN      NaN             NaN        NaN     Kaleb  J. Maya          NaN     Julie   None[j]     Kendra     Bruce     Julie      Drew         Dee     Won[f]                 NaN
8             Katurah             NaN      NaN      NaN             NaN        NaN     Kaleb  J. Maya          NaN     Kaleb      Jake    None[i]     Bruce     Julie      Drew       Julie    Lost[f]                 NaN
9               Julie             NaN      NaN      NaN            Sean        NaN     Kaleb  J. Maya          NaN     Kaleb    Kellie     Kendra     Bruce     Emily      Drew        Jake        NaN                 NaN
10               Drew             NaN      NaN      NaN             NaN     Brando     Kaleb  J. Maya         Sifu       NaN    Kellie     Kendra      Jake     Julie     Julie         NaN        NaN                 NaN
11              Emily          Hannah  Brandon  Sabiyah             NaN     Brando     Kaleb  J. Maya         Sifu       NaN    Kellie    None[i]     Bruce     Julie       NaN         NaN        NaN                 NaN
12              Bruce             NaN      NaN      NaN             NaN        NaN     Kaleb  J. Maya         Sifu       NaN   None[k]     Kendra     Julie       NaN       NaN         NaN        NaN                 NaN
13             Kendra             NaN      NaN      NaN             NaN       Drew     Kaleb  J. Maya         Sifu       NaN      Jake       Jake       NaN       NaN       NaN         NaN        NaN                 NaN
14             Kellie             NaN      NaN      NaN             NaN        NaN     Kaleb  J. Maya         Sifu       NaN      Jake        NaN       NaN       NaN       NaN         NaN        NaN                 NaN
15              Kaleb          Hannah  Brandon  Sabiyah             NaN        NaN   None[j]  None[j]          NaN     Julie       NaN        NaN       NaN       NaN       NaN         NaN        NaN                 NaN
16               Sifu             NaN      NaN      NaN            Sean        NaN     Kaleb  J. Maya        Bruce       NaN       NaN        NaN       NaN       NaN       NaN         NaN        NaN                 NaN
17            J. Maya             NaN      NaN      NaN            Sean        NaN     Kaleb    Emily          NaN       NaN       NaN        NaN       NaN       NaN       NaN         NaN        NaN                 NaN
18             Brando             NaN      NaN      NaN             NaN       Drew       NaN      NaN          NaN       NaN       NaN        NaN       NaN       NaN       NaN         NaN        NaN                 NaN
19               Sean          Hannah  Brandon    Kaleb             Dee        NaN       NaN      NaN          NaN       NaN       NaN        NaN       NaN       NaN       NaN         NaN        NaN                 NaN
20            Sabiyah          Hannah  None[l]  None[h]             NaN        NaN       NaN      NaN          NaN       NaN       NaN        NaN       NaN       NaN       NaN         NaN        NaN                 NaN
21            Brandon          Hannah  None[l]      NaN             NaN        NaN       NaN      NaN          NaN       NaN       NaN        NaN       NaN       NaN       NaN         NaN        NaN                 NaN
22             Hannah         None[b]      NaN      NaN             NaN        NaN       NaN      NaN          NaN       NaN       NaN        NaN       NaN       NaN       NaN         NaN        NaN                 NaN
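The tables above come back with two-level column headers, which gets in the way of the stated goal of combining everything into one dataframe. The following is a rough sketch of one way to flatten the headers and stack several seasons, not the only way to do it. The helper names `flatten_columns` and `combine_seasons` are hypothetical, and the demo frames are tiny hand-built stand-ins (the season 44 row is a placeholder, not real data).

```python
import pandas as pd


def flatten_columns(df):
    """Collapse two-level column headers into single strings."""
    out = df.copy()
    flat = []
    for top, bottom in out.columns:
        # When both levels repeat the same label (e.g. "Contestant"), keep one copy.
        flat.append(top if top == bottom else f"{top} {bottom}")
    out.columns = flat
    return out


def combine_seasons(tables_by_season):
    """Stack per-season tables into one frame, tagging each row with its season."""
    frames = [
        flatten_columns(df).assign(season=season)
        for season, df in tables_by_season.items()
    ]
    return pd.concat(frames, ignore_index=True)


# Stand-in for the scraped contestants tables: same header shape, far fewer columns.
cols = pd.MultiIndex.from_tuples(
    [("Contestant", "Contestant"), ("Tribe", "Original"), ("Finish", "Day")]
)
demo = {
    44: pd.DataFrame([["Player A", "Tribe X", "Day 3"]], columns=cols),  # placeholder row
    45: pd.DataFrame([["Hannah Rose", "Lulu", "Day 3"]], columns=cols),
}

combined = combine_seasons(demo)
print(combined)
```

With the real data you would pass the `contestants` dictionary instead of `demo`. The voting history table has a different layout (voters as rows, episodes as columns, plus "Unnamed" header levels), so it would probably need its own cleanup before it can be joined to the others.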
