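I am extracting data on Amazon bestselling books: the title, the author name, and the price. For this task I am using the BeautifulSoup and Requests libraries. The URL is https://www.amazon.in/gp/bestsellers/books/ [Code snippet and HTML tree structure for the price information](https://i.stack.imgur.com/7K0rN.png)

I get an empty list for the price information. How can I fix this?

I looked at a Stack Overflow post that suggests using lxml. Would that also apply in my case?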

Recommended answer

Here is an example of how to get the title and price of each book:

import requests
from bs4 import BeautifulSoup

# Send a browser-like User-Agent along with the request.
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0"
}

url = "https://www.amazon.in/gp/bestsellers/books/"

soup = BeautifulSoup(requests.get(url, headers=headers).content, "html.parser")

# Each bestseller entry is a <div id="gridItemRoot"> element.
for book in soup.select("div#gridItemRoot"):
    # The first link that does not wrap an image holds the title text.
    title = book.select_one("a:not(:has(img))").text
    price = book.select_one(".a-color-price").text
    print(f"{price:<10} {title}")

Prints:

₹115.00    The Power of Your Subconscious Mind: Original Edition | Premium Paperback
₹649.00    Indian Polity for UPSC (English)|7th Edition|Civil Services Exam| State Administrative Exams
₹97.00     MINtile Sank Magic Practice Copybook, (4 Book + 10 Refill) Number Tracing Book for Preschoolers with Pen, Magic Calligraphy Copybook Set Practical Reusable Writing Tool Simple Hand Lettering
₹511.00    Atomic Habits
₹220.00    The Psychology Of Money
₹409.00    My First Library: Boxset of 10 Board Books for Kids
₹184.00    Don't Believe Everything You Think (English)
₹339.00    Ikigai
₹156.00    Do It Today: Overcome procrastination, improve productivity and achieve more meaningful things [Paperback] Foroux, Darius
₹411.00    Educart 21 Days Challenge with Class 10 Final Revision Book for CBSE 2024 Board Exam- Maths + Science + SST + English (2023-24)
₹87.00     My First Book of Patterns Pencil Control: Patterns Practice book for kids (Pattern Writing)
₹97.00     201 Brain Booster Activity Book - Fun Activities and Exercises For Children: Tracing & Pattern, Colors & Shapes, Maze
₹545.00    Why Bharat Matters
₹180.00    BRAHMASTRA Complete Maths Multicolored Formula Book Second Edition BILINGUAL by Aditya Ranjan Sir
₹193.00    My First Mythology Tale (Illustrated) (Set of 5 Books) - Mahabharata, Krishna, Hanuman, Ganesha, Ramayana - Story Book for Kids - English Short Stories with Colourful Pictures - Read Aloud to Infants, Toddlers
₹296.00    Blackbook of English Vocabulary
₹97.00     You Can
₹211.00    Moral Story Books for Kids (Illustrated) - English Short Stories with Colourful Pictures - Bedtime Children Story Book - 3 Years to 6 Years Old Children - Read Aloud to Infants, Toddlers (Set of 10 Books)
₹97.00     Meditations
₹399.00    NTA UGC NET /SET/JRF Paper 1, Teaching and Research Aptitude – 2023, Includes latest 2022 paper and 2600+ Practice Questions with Solutions | Includes NEP - 2020| 7th Edition - By Pearson
₹97.00     The Power of Your Subconscious Mind
₹426.00    Rich Dad Poor Dad: 25th Anniversary Edit
₹187.00    Animals Tales From Panchtantra: Timeless Stories for Children From Ancient India
₹262.00    Oxford Student Atlas for India, Fourth Edition - Useful for Competitive Exams
₹93.00     How to Win Friends and Influence People : Original Edition | Premium Paperback
₹339.00    Colouring Books Boxset: Pack of 12 Copy Colour Books For Children
₹183.00    The Alchemist
₹156.00    Grandma's Bag of Stories: Collection of 20+ Illustrated short stories, traditional Indian folk tales for all ages for children of all ages by Sudha Murty [Paperback] Sudha Murty
₹147.00    Shlokas and Mantras - Activity Book For Kids - Illustrated Book With Engaging Activities and Sticker Sheets
₹94.00     Brain Activity Book for Kids - 200+ Activities for Age 3+
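
One thing to watch for with the loop above: `select_one` returns `None` when nothing matches, so an item without a price element makes `.text` raise `AttributeError`. The following is a minimal sketch of a more defensive version, not a definitive implementation. It runs against a small inline HTML sample instead of the live page; `SAMPLE_HTML` and `extract_books` are hypothetical names, and the sample markup is an assumption that only mimics the selectors used in the answer.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup standing in for the real bestseller page.
# The second item deliberately has no price element.
SAMPLE_HTML = """
<div id="gridItemRoot">
  <a href="/b1"><img src="b1.jpg"></a>
  <a href="/b1">Atomic Habits</a>
  <span class="a-color-price">₹511.00</span>
</div>
<div id="gridItemRoot">
  <a href="/b2"><img src="b2.jpg"></a>
  <a href="/b2">Ikigai</a>
</div>
"""


def extract_books(html):
    """Return a list of (title, price) tuples; a missing field becomes None."""
    soup = BeautifulSoup(html, "html.parser")
    books = []
    for item in soup.select("div#gridItemRoot"):
        title_tag = item.select_one("a:not(:has(img))")
        price_tag = item.select_one(".a-color-price")
        # Guard against None before reading the text.
        title = title_tag.get_text(strip=True) if title_tag else None
        price = price_tag.get_text(strip=True) if price_tag else None
        books.append((title, price))
    return books


books = extract_books(SAMPLE_HTML)
for title, price in books:
    print(f"{price or 'N/A':<10} {title or 'N/A'}")
```

To use it on the live page, pass `requests.get(url, headers=headers).content` to `extract_books`. On the lxml question: the parser argument can be changed from `"html.parser"` to `"lxml"` if that package is installed, and the selectors stay the same.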
