CSS selector copied from the browser returns different results in Python with BeautifulSoup4

Posted on May 29

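Usually, when I want to scrape a specific piece of text from a website, I right-click the text and select Inspect. Then, in the HTML code, I find the text I'm interested in and copy its CSS selector.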

I then paste the string I just copied into soup.select("paste copied text here") and save the result to a variable. After that, I can strip the text to get the key text I need.
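The workflow described above can be sketched roughly as follows. This is only a minimal illustration: the inline markup and the selector are hypothetical stand-ins (not taken from carsales.com.au), used so that the snippet runs without a network request.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded page.
html = """
<html><body>
  <div class="listing">
    <h1 class="title">  1,712 Toyota RAV4 cars for sale  </h1>
  </div>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# select() always returns a list; it is empty when nothing matches.
matches = soup.select("body > div.listing > h1.title")

# get_text(strip=True) removes leading and trailing whitespace.
texts = [el.get_text(strip=True) for el in matches]
print(texts)
```

Because select() returns a list, an empty list means the selector matched nothing in the HTML that was parsed, rather than raising an error.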

For the case I'm working on now, I want to get the total number of cars shown in the h1 heading of this page: carsales.com.au/cars/used/toyota/rav4/. As of now, that number is 1712.

Here is my code:

import requests
from bs4 import BeautifulSoup as bs

url = "https://www.carsales.com.au/cars/used/toyota/rav4/"
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.41 Safari/537.36'}

res = requests.get(url,headers = headers)
res.raise_for_status()

# Print the entire page.
# print(res.text)

# A status code of 200 means the request succeeded.
# print(res.status_code)

soup = bs(res.text, 'html.parser')

# Get how many cars are available from the search link.
# This is the alternate approach, since the soup.select method is not working.
# header_h1 = soup.find_all('h1')
# print(header_h1)


total_cars_element = soup.select('body > div.listing > div.container.listing-container.has-header-sticky > div.row.flex-nowrap.no-gutters > div:nth-child(1) > div:nth-child(1) > div')

print(total_cars_element)
# the above prints an empty list.

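I'd really just like to know why this doesn't work. I know there are other workarounds, such as the one I mentioned in the code above, but I'd really like to keep using the soup.select method.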

Thanks a lot for any insight!
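One possible way to investigate an empty result like this is to test the copied selector against the HTML that requests actually received, dropping trailing steps until something matches. A browser's inspector shows the DOM after scripts have run, which can differ from the raw response, so the point where the chain breaks may hint at what is missing. This is only a diagnostic sketch: the helper name `longest_matching_prefix` and the sample markup are hypothetical, and splitting on ">" assumes the selector uses only child combinators, as the one in the question does.

```python
from bs4 import BeautifulSoup


def longest_matching_prefix(soup, selector):
    """Return the longest leading part of a '>' chain that matches, or None.

    Hypothetical helper: assumes the selector is a plain chain of
    child combinators with no '>' inside attribute values.
    """
    steps = [step.strip() for step in selector.split(">")]
    # Try the full chain first, then drop one trailing step at a time.
    for end in range(len(steps), 0, -1):
        candidate = " > ".join(steps[:end])
        if soup.select(candidate):
            return candidate
    return None


# Hypothetical markup standing in for res.text.
html = (
    '<html><body><div class="listing">'
    '<div class="container"></div>'
    "</div></body></html>"
)
soup = BeautifulSoup(html, "html.parser")

selector = "body > div.listing > div.container > div.row > h1"
prefix = longest_matching_prefix(soup, selector)
print(prefix)
```

In real use, `soup` would be built from `res.text` and `selector` would be the string copied from the browser. If the returned prefix is shorter than the full selector, the elements after that point are not present in the downloaded HTML in the position the selector expects.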
