Python: finding words with BeautifulSoup

Posted on August 17

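I want to extract the ads on a website that contain either of two specific Persian words, "توافق" or "توافقی". I am using BeautifulSoup and splitting the content of the soup to find the ads that contain these words, but my code does not work. Could you help me? Here is my simple code: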

import requests
from bs4 import BeautifulSoup

r = requests.get("https://divar.ir/s/tehran")
soup = BeautifulSoup(r.text, "html.parser")
results = soup.find_all("div", attrs={"class": "kt-post-card__body"})
for content in results:
    words = content.split()
    if words == "توافقی" or words == "توافق":
        print(content)

Recommended answer

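Because توافقی appears in the div tags with the class kt-post-card__description, I would search for those. From each matching tag you can then reach the rest of the ad through the tag's attributes, such as .previous_sibling, .parent, or whichever fits.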

import requests
from bs4 import BeautifulSoup

r = requests.get("https://divar.ir/s/tehran")
soup = BeautifulSoup(r.text, "html.parser")
results = soup.find_all("div", attrs={"class": "kt-post-card__description"})
for content in results:
    text = content.text
    if "توافقی" in text or "توافق" in text:
        print(content.previous_sibling)   # It's the h2 title.
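
The same idea can be tried offline with a minimal sketch. It also shows why the code in the question finds nothing: a Tag is not a string, so the text has to be read first (for example with get_text()), and a list of words compared with == to a single string is never equal, so a substring test with "in" is used instead. The SAMPLE_HTML markup, the kt-post-card__title class, and the helper name find_matching_titles are assumptions made for illustration; the real page may be structured differently. find_previous_sibling("h2") is used in place of .previous_sibling because the latter can return a whitespace text node when the markup contains line breaks between tags.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page,
# whose actual structure may differ.
SAMPLE_HTML = """
<div class="kt-post-card__body">
  <h2 class="kt-post-card__title">Ad one</h2>
  <div class="kt-post-card__description">توافقی</div>
</div>
<div class="kt-post-card__body">
  <h2 class="kt-post-card__title">Ad two</h2>
  <div class="kt-post-card__description">1,000,000 تومان</div>
</div>
"""

# The two words from the question.
KEYWORDS = ("توافقی", "توافق")


def find_matching_titles(html, keywords=KEYWORDS):
    """Return the h2 titles of ads whose description contains a keyword."""
    soup = BeautifulSoup(html, "html.parser")
    titles = []
    for desc in soup.find_all("div", class_="kt-post-card__description"):
        text = desc.get_text()  # read the tag's text before searching it
        if any(word in text for word in keywords):
            # Skip whitespace text nodes and go straight to the h2 sibling.
            title = desc.find_previous_sibling("h2")
            if title is not None:
                titles.append(title.get_text(strip=True))
    return titles


print(find_matching_titles(SAMPLE_HTML))
```

To run it against the live page, pass r.text from the requests call above instead of SAMPLE_HTML. Since "توافق" is a substring of "توافقی", checking for the shorter word alone would match both.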