<div class="NewsList_newsListContent__4UpiN">
<div>
<div>
<div class="NewsList_newsListItemWrap__XovMP">
<div style="display: flex;">
<div class="NewsList_newsListItem__yRAbe">
<a href="/flash-categories/Currency">
<div class="NewsList_newsListTag__TGHJ_">
<span>Currency</span>
</div></a></div></div>
<div class="NewsList_newsListContent__4UpiN">
<div class="NewsList_infoNewsListSubMobile__SPmAG">
<span>06 Jun 2023, 10:05 am </span>
</div>
<div class="NewsList_newsListText__hstO7">
<a href="/node/669947">
<span class="NewsList_newsListItemHead__dg7eK">Ringgit lower against US dollar in early session on June 6</span>
</a>
<a href="/node/669947">
<span class="NewsList_newsList__2fXyv">KUALA LUMPUR (June 6): The ringgit opened lower against   the US dollar in the early session on Tuesday (June 6), as investors remain cautious on the global outlook despite a slightly weaker greenback, an analyst said.&nbsp;At 9am, the local note fell to 4.5950/6000 versus the greenback, compared with Friday (June 2)’s closing of&nbsp;4.5745/5785.  </span>
</a>
</div>

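For example, I would like to scrape the headline text in the snippet above: "Ringgit lower against US dollar in early session on June 6".

Here is my code: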

import requests
from bs4 import BeautifulSoup

url = "https://theedgemalaysia.com/categories/malaysia"

# Send a GET request to the URL
response = requests.get(url)

# Create a BeautifulSoup object to parse the HTML content
soup = BeautifulSoup(response.text, 'html.parser')

# Find all <div> elements with class "NewsList_newsListContent__4UpiN"
container_divs = soup.find_all('div', class_='NewsList_newsListContent__4UpiN')

# Iterate over the container divs
for container_div in container_divs:
    # Find all <div> elements with class "NewsList_newsListText__hstO7" within the container
    news_text_divs = container_div.find_all('div', class_='NewsList_newsListText__hstO7')

    # Iterate over the news text divs
    for news_text_div in news_text_divs:
        # Find the <span> element with class "NewsList_newsListItemHead__dg7eK" within the news text div
        headline_span = news_text_div.find('span', class_='NewsList_newsListItemHead__dg7eK')

        # Print the text of the headline
        if headline_span:
            print(headline_span.text)

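I have tried the script above but cannot find the error. Could anyone take a look and let me know where the problem is? Thank you very much!

Recommended answer

The page is built by JavaScript from data that already exists in a script tag. Requests cannot execute JavaScript, so it does not see the headlines that appear when you visit the page in a JS-enabled browser.

Here is one way to get those titles: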
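A quick way to confirm this diagnosis is to look at the raw HTML without a browser. The sketch below is only illustrative: `SAMPLE_HTML` is a hypothetical, trimmed-down stand-in for what the server returns, and `diagnose` is a helper name made up for this example. It counts the headline spans the question's script looks for and checks whether the same titles are present as JSON inside the `__NEXT_DATA__` script tag.

```python
import json

from bs4 import BeautifulSoup

# Hypothetical, trimmed-down stand-in for the HTML that requests receives:
# no rendered headline spans, only the embedded JSON payload.
SAMPLE_HTML = """
<html><body>
<div id="__next"></div>
<script id="__NEXT_DATA__" type="application/json">
{"props": {"pageProps": {"corporateData": [
  {"nid": 669947, "title": "Ringgit lower against US dollar in early session on June 6"}
]}}}
</script>
</body></html>
"""


def diagnose(html):
    """Return (number of rendered headline spans, titles found in the embedded JSON)."""
    soup = BeautifulSoup(html, "html.parser")

    # What the question's script searches for
    headline_spans = soup.find_all("span", class_="NewsList_newsListItemHead__dg7eK")

    # Where the data sits before JavaScript runs
    data_script = soup.find("script", id="__NEXT_DATA__")
    titles = []
    if data_script and data_script.string:
        data = json.loads(data_script.string)
        items = data.get("props", {}).get("pageProps", {}).get("corporateData", [])
        titles = [item.get("title") for item in items]
    return len(headline_spans), titles


span_count, titles = diagnose(SAMPLE_HTML)
print(span_count)
print(titles)
```

If the span count is zero while the title list is not empty, the selectors in the question are not wrong as such; the elements simply do not exist until the browser runs the page's JavaScript.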

import json

import pandas as pd
import requests
from bs4 import BeautifulSoup as bs

# Browser-like User-Agent header sent with the request
headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36'
}

url = 'https://theedgemalaysia.com/categories/malaysia'
r = requests.get(url, headers=headers)
soup = bs(r.text, 'html.parser')

# Next.js embeds the page data as JSON in the <script id="__NEXT_DATA__"> tag
data_script = soup.select_one('script[id="__NEXT_DATA__"]')
data = json.loads(data_script.string)

# Flatten the list of article records into a DataFrame; the headlines are in the 'title' column
df = pd.json_normalize(data['props']['pageProps']['corporateData'])
print(df)

Result in the terminal:

    nid     type    language    category    options     flash   tags    edited  title   created     updated     author  source  audio   audioflag   alias   video_url   img     caption     summary
0   669998  article     english     Corporate,Malaysia  Top Stories     Noon Market             Bursa stays in the red at midday    1686027040000   1686027040000   Bernama     Bernama         0   node/669998         https://assets.theedgemarkets.com/noon-market-...       Bursa Malaysia stayed in the red at midday due...
1   669997  article     english     Corporate,Malaysia                  Skyworld eyes 3Q Main Market listing, inks und...   1686026908000   1686026908000   Lam Jian Wyn    theedgemalaysia.com         0   node/669997         https://assets.theedgemarkets.com/SkyWorld-Dev...       KUALA LUMPUR (June 6): SkyWorld Development Bh...
2   669995  article     english     Malaysia    Top Stories,Politics & Government   Parliament          Investigation into Kedah MB over comments Pena...   1686026473000   1686026473000   Hailey Chung & Chester Tay  theedgemalaysia.com         0   node/669995         https://assets.theedgemarkets.com/Kedah Sanusi...   Kedah Menteri Besar Datuk Seri Muhammad Sanusi...   Police have started an investigation into Keda...
3   669984  article     english     Malaysia,Economy    Top Stories,Politics & Government   Parliament  mynewstv        Anwar defends BNM’s gradual approach to moneta...   1686025226000   1686025226000   Hailey Chung & Chester Tay  theedgemalaysia.com         0   node/669984         https://assets.theedgemarkets.com/Anwar 060620...       Higher borrowing costs and the sharp depreciat...
4   669980  article     english     Malaysia,World,Economy  Top Stories,Politics & Government   ESG             Global carbon markets face upheaval as nations...   1686024746000   1686024746000   Natasha White & Ewa Krukowska   Bloomberg       0   node/669980         https://assets.theedgemarkets.com/398972891-fo...       LONDON/BRUSSELS (June 6): The US$2 billion mar...
5   669961  article     english     Corporate,Malaysia              Isabelle Francis    CGS-CIMB starts coverage of Dayang Enterprise ...   1686022324000   1686022324000   Anis Hazim  theedgemalaysia.com         0   node/669961         https://assets.theedgemarkets.com/Dayang-Enter...       CGS-CIMB has initiated coverage of Dayang Ente...
6   669957  article     english     Malaysia    Politics & Government               Kit Siang expresses gratitude to Agong for 'Ta...   1686021406000   1686021406000   Bernama     Bernama         0   node/669957         https://assets.theedgemarkets.com/Lim-Kit-sian...       Veteran politician Tan Sri Lim Kit Siang expre...
7   669956  article     english     Corporate,Malaysia          mynewstv        1Q results came broadly below expectations, sa...   1686020951000   1686020951000   Isabelle Francis    theedgemalaysia.com         0   node/669956         https://assets.theedgemarkets.com/Bursa-Malays...       KUALA LUMPUR (June 6): Analysts said the first...
8   669954  article     english     Corporate,Management,Malaysia   Top Stories     ESG     mynewstv        24 public-listed companies still have no women...   1686020019000   1686020019000   Tan Zhai Yun    theedgemalaysia.com         0   node/669954         https://assets.theedgemarkets.com/Bursa-4_2023...       KUALA LUMPUR (June 6): As at June 1, 2023, 24 ...
9   669953  article     english     Corporate,Malaysia      Hot Stock   mynewstv    Lam Jian Wyn    Bumi Armada shares fall 20.47% on Kraken FPSO ...   1686019041000   1686019041000   Anis Hazim  theedgemalaysia.com         0   node/669953         https://assets.theedgemarkets.com/Bumi-Armada-...       KUALA LUMPUR (June 6): Shares of Bumi Armada B...
10  669951  article     english     Corporate,Malaysia,World        Global Markets          Asian stocks wobble as traders weigh Fed rate ...   1686018738000   1686018738000   Ankur Banerjee  Reuters         0   node/669951         https://assets.theedgemarkets.com/395135636-As...       SINGAPORE (June 6): Asian stock markets edged ...
11  669948  article     english     Malaysia,Court  Politics & Government           Lam Jian Wyn    High Court dismisses Zuraida’s leave applicati...   1686017713000   1686017713000   Tarani Palani   theedgemalaysia.com         0   node/669948         https://assets.theedgemarkets.com/Zuraida-Kama...   Ampang member of Parliament Datuk Zuraida Kama...   KUALA LUMPUR (June 6): The High Court has dism...
12  669947  article     english     Malaysia    Top Stories     Currency            Ringgit lower against US dollar in early sessi...   1686017126000   1686017126000   Bernama     Bernama         0   node/669947         https://assets.theedgemarkets.com/Ringgit-5_20...       KUALA LUMPUR (June 6): The ringgit opened lowe...
13  669945  article     english     Corporate,Malaysia      Market Open             Bursa Malaysia marginally higher in early sess...   1686016694000   1686016694000   Bernama     Bernama         0   node/669945         https://assets.theedgemarkets.com/opening-mark...       KUALA LUMPUR (June 6): Bursa Malaysia rebounde...

See the BeautifulSoup documentation and the pandas documentation for more details.
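If only a few fields are needed, the records can also be post-processed without pandas. The following is a minimal sketch under some assumptions: `created` is treated as a Unix timestamp in milliseconds (which matches the "10:05 am" shown for the ringgit article when read as Malaysia time, UTC+8), and article links are formed by joining the site root with `alias`, as the `/node/669947` link in the question suggests. `summarize`, `BASE_URL`, `MYT`, and `sample` are names made up for this example.

```python
from datetime import datetime, timedelta, timezone

BASE_URL = "https://theedgemalaysia.com/"  # assumption: aliases are relative to the site root
MYT = timezone(timedelta(hours=8))         # assumption: display times in Malaysia Time (UTC+8)


def summarize(records):
    """Reduce article records to title, readable creation time, and full URL."""
    rows = []
    for rec in records:
        # 'created' is assumed to be milliseconds since the Unix epoch
        created = datetime.fromtimestamp(int(rec["created"]) / 1000, tz=MYT)
        rows.append({
            "title": rec["title"],
            "created": created.isoformat(),
            "url": BASE_URL + rec["alias"],
        })
    return rows


# One record with values copied from the terminal output above
sample = [{
    "title": "Ringgit lower against US dollar in early session on June 6",
    "created": 1686017126000,
    "alias": "node/669947",
}]

for row in summarize(sample):
    print(row["created"], row["title"], row["url"])
```

With the real data, the list passed to `summarize` would be `data['props']['pageProps']['corporateData']` from the answer's script.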
