Python: how to filter for tags without attributes in BeautifulSoup's find_all() function

Posted on March 9

Below is the simple HTML source I am working with:

<html>
<head>
<title>Welcome to the comments assignment from www.py4e.com</title>
</head>
<body>
<h1>This file contains the actual data for your assignment - good luck!</h1>

<table border="2">
<tr>
<td>Name</td><td>Comments</td>
</tr>
<tr><td>Melodie</td><td><span class="comments">100</span></td></tr>
<tr><td>Machaela</td><td><span class="comments">100</span></td></tr>
<tr><td>Rhoan</td><td><span class="comments">99</span></td></tr>

Below is the code I tried in order to get the <td>Melodie</td> rows:

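from bs4 import BeautifulSoup

# Placeholder: in the real script this holds the HTML text shown above
html = 'the HTML text shown above'

soup = BeautifulSoup(html, 'html.parser')

# Print every <td> element, separated by a divider line
for tag in soup.find_all('td'):
    print(tag)
    print('----')

# Result: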
#===============================================================================
# <td>Name</td>
# ----
# <td>Comments</td>
# ----
# <td>Melodie</td>
# ----
# <td><span class="comments">100</span></td>
# ----
# <td>Machaela</td>
# ----
# <td><span class="comments">100</span></td>
# ----
# <td>Rhoan</td>
# ----
#.........
#===============================================================================

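Now I only want the <td>name</td> rows, not the rows that contain "span" and "class". I tried two filters, soup.find_all('td' and not 'span') and soup.find_all('td', attrs={'class':None}), but neither works. I know there are other ways to do this, but I would like to use a filter inside soup.find_all(). The output I want is: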

# <td>Name</td>
# ----
# <td>Comments</td>
# ----
# <td>Melodie</td>
# ----
# <td>Machaela</td>
# ----
# <td>Rhoan</td>
# ----
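
As a rough illustration of why the two attempts above do not filter anything useful, the sketch below uses a trimmed copy of the table (an assumption made for brevity, not the full page). It shows that Python evaluates 'td' and not 'span' to a plain boolean before find_all() ever receives it, and that the class attribute sits on the span elements, so no td carries a class that a class-based filter could test.

```python
from bs4 import BeautifulSoup

# Trimmed sample of the table from the question (assumed for illustration)
html = """
<table>
<tr><td>Name</td><td>Comments</td></tr>
<tr><td>Melodie</td><td><span class="comments">100</span></td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# The expression is evaluated by Python first: not 'span' is False,
# so find_all() receives False rather than any description of tags.
first_attempt = 'td' and not 'span'
print(first_attempt)  # False

# The class attribute belongs to <span>, not <td>, so every <td>
# reports no class at all and a class test cannot tell them apart.
td_classes = [td.get("class") for td in soup.find_all("td")]
print(td_classes)  # [None, None, None, None]
```

In other words, the condition that matters is whether a td contains a span, not which attributes the td itself has.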

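A recommended answer uses a CSS selector through select() instead of find_all(), keeping only the td elements that contain no span:

import urllib.request
from bs4 import BeautifulSoup

# Download the page and parse it
html = urllib.request.urlopen('http://py4e-data.dr-chuck.net/comments_1430669.html').read()
soup = BeautifulSoup(html, 'html.parser')

# Select every <td> that has no <span> descendant
for e in soup.select('td:not(:has(span))'):
    print(e)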
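
Since the question asks specifically for a find_all() filter, here is a minimal sketch of one possible way to get the same result without select(). find_all() accepts a function that receives each tag and returns True or False. The helper name td_without_span and the inline sample HTML are assumptions made for illustration; they do not come from the original post.

```python
from bs4 import BeautifulSoup

# Inline sample modelled on the table in the question (assumed for illustration)
html = """
<table border="2">
<tr><td>Name</td><td>Comments</td></tr>
<tr><td>Melodie</td><td><span class="comments">100</span></td></tr>
<tr><td>Machaela</td><td><span class="comments">100</span></td></tr>
<tr><td>Rhoan</td><td><span class="comments">99</span></td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")


def td_without_span(tag):
    """Hypothetical helper: True for <td> tags that contain no <span>."""
    return tag.name == "td" and tag.find("span") is None


# find_all() calls the function once per tag and keeps those returning True
plain_cells = soup.find_all(td_without_span)

for cell in plain_cells:
    print(cell)
    print("----")
```

The same test could be written inline as a lambda; a named function is used here only to keep the condition readable.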

