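Problem scraping box scores from baseball-reference.com with Python

Posted on August 17

I'm having trouble scraping the player hyperlinks from the page below. My script only prints the players from the menu at the bottom of the page, not the players listed in the box score for the game. What do I need to change to get the Minnesota Twins and Angels players?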

import requests
from bs4 import BeautifulSoup

# URL of the webpage
url = "https://www.baseball-reference.com/boxes/ANA/ANA202305210.shtml"

# Send a GET request to the webpage
response = requests.get(url)

# Check if the request was successful
if response.status_code == 200:
    # Parse the HTML content of the webpage using BeautifulSoup
    soup = BeautifulSoup(response.content, 'html.parser')
    
    # Find all hyperlink elements on the page with "/players/" in the href attribute
    links = soup.find_all('a', href=lambda href: href and '/players/' in href)
    
    # Extract and print the href attribute of each matching hyperlink
    for link in links:
        href = link.get('href')
        print(href)
else:
    print("Failed to fetch the webpage.")
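
The behavior described above can be reproduced without touching the network. The following is a minimal sketch, assuming (as the answer below suggests) that the box score tables are wrapped in HTML comments while the footer menu is ordinary markup. `SAMPLE_HTML`, the footer link, and `visible_player_links` are hypothetical names made up for illustration and are not taken from the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup: the box score table sits inside an
# HTML comment, while the footer menu is regular markup.
SAMPLE_HTML = """
<div id="box">
<!--
<table><tr><td><a href="/players/g/gallojo01.shtml">Joey Gallo</a></td></tr></table>
-->
</div>
<div id="footer"><a href="/players/e/exampfo01.shtml">Footer Example</a></div>
"""


def visible_player_links(html):
    """Return player hrefs that a plain find_all() can see."""
    soup = BeautifulSoup(html, "html.parser")
    anchors = soup.find_all("a", href=lambda h: h and "/players/" in h)
    return [a["href"] for a in anchors]


visible = visible_player_links(SAMPLE_HTML)
print(visible)
```

Only the footer link is returned, because the parser treats everything inside the comment as a single text node rather than as tags.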

import requests
from bs4 import BeautifulSoup, Comment

url = "https://www.baseball-reference.com/boxes/ANA/ANA202305210.shtml"

response = requests.get(url)
response.raise_for_status()

soup = BeautifulSoup(response.content, "html.parser")

# the key part:
# convert the HTML comment sections to a new BeautifulSoup object
new_soup = ""
for c in soup.find_all(string=Comment):
    new_soup += c if c.strip().startswith("<") else ""

new_soup = BeautifulSoup(new_soup, "html.parser")

links = new_soup.find_all("a", href=lambda href: href and "/players/" in href)
for link in links:
    href = link.get("href")
    print(f"{link.text:<30} {href}")

Joey Gallo                     /players/g/gallojo01.shtml
Carlos Correa                  /players/c/correca01.shtml
Alex Kirilloff                 /players/k/kirilal01.shtml
Edouard Julien                 /players/j/julieed01.shtml
Kyle Farmer                    /players/f/farmeky01.shtml
Trevor Larnach                 /players/l/larnatr01.shtml
Willi Castro                   /players/c/castrwi01.shtml
Donovan Solano                 /players/s/solando01.shtml
Ryan Jeffers                   /players/j/jeffery01.shtml
Pablo López                    /players/l/lopezpa01.shtml
Jorge López                    /players/l/lopezjo02.shtml
José De León                   /players/d/deleojo03.shtml

...and so on.
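
The comment-parsing step can also be wrapped in a reusable function. This is a sketch under stated assumptions, not the answer's exact code: it parses each markup-bearing comment separately, drops repeated links, and turns the relative hrefs into absolute URLs with `urljoin`. The function name `player_links_from_comments`, the `BASE_URL` constant, and the sample markup are hypothetical and chosen only for illustration.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup, Comment

# Assumed site root, used only to build absolute URLs.
BASE_URL = "https://www.baseball-reference.com"


def player_links_from_comments(html, base_url=BASE_URL):
    """Return unique (name, absolute_url) pairs found inside HTML comments."""
    soup = BeautifulSoup(html, "html.parser")
    seen = {}  # url -> player name; dicts keep insertion order
    for comment in soup.find_all(string=Comment):
        # Skip comments that are plain notes rather than commented-out markup.
        if not comment.strip().startswith("<"):
            continue
        fragment = BeautifulSoup(str(comment), "html.parser")
        anchors = fragment.find_all("a", href=lambda h: h and "/players/" in h)
        for a in anchors:
            url = urljoin(base_url, a["href"])
            seen.setdefault(url, a.get_text(strip=True))
    return [(name, url) for url, name in seen.items()]


# Hypothetical sample standing in for the downloaded page.
SAMPLE = """
<!-- plain note, not markup -->
<!--
<table><tr>
<td><a href="/players/g/gallojo01.shtml">Joey Gallo</a></td>
<td><a href="/players/c/correca01.shtml">Carlos Correa</a></td>
<td><a href="/players/g/gallojo01.shtml">Joey Gallo</a></td>
<td><a href="/teams/MIN/2023.shtml">Twins</a></td>
</tr></table>
-->
"""

players = player_links_from_comments(SAMPLE)
for name, url in players:
    print(f"{name:<30} {url}")
```

With a live page, the same function could be called on `response.text` after `response.raise_for_status()`.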

