Update: it turns out the tag name should be "ix:nonfraction".
The following does not work. No "ix" tags are found.
from bs4 import BeautifulSoup
text = """
<td style="BORDER-BOTTOM:0.75pt solid #7f7f7f;white-space:nowrap;vertical-align:bottom;text-align:right;">$ <ix:nonfraction name="ecd:AveragePrice" contextref="P01_01_2022To12_31_2022" unitref="Unit_USD" decimals="2" scale="0" format="ixt:num-dot-decimal">97.88</ix:nonfraction>
</td>
"""
soup = BeautifulSoup(text, 'lxml')
print(soup)
ix_tags = soup.find_all('ix')
print(ix_tags)
But the following works. I can't see what the difference is. Why is that? Thanks!
html_content = """
<html>
<body>
<ix>Tag 1</ix>
<ix>Tag 2</ix>
<ix>Tag 3</ix>
<p>Not an ix tag</p>
</body>
</html>
"""
soup = BeautifulSoup(html_content, 'lxml')
ix_tags = soup.find_all('ix')
for tag in ix_tags:
    print(tag.text)
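For comparison, here is a minimal sketch of searching by the full tag name, prefix included. In the first snippet the parser sees the element as a single tag named "ix:nonfraction", so find_all('ix') has nothing to match, whereas the second snippet contains tags literally named "ix". The sketch uses the built-in "html.parser" so it runs without extra dependencies (an assumption on my part; the lxml parser from the question should behave the same way for this input), and the markup is a shortened version of the one above.

```python
import re
from bs4 import BeautifulSoup

# Shortened version of the markup from the question.
text = """
<td>$ <ix:nonfraction name="ecd:AveragePrice" unitref="Unit_USD" decimals="2">97.88</ix:nonfraction></td>
"""

# "html.parser" is used here only to avoid the lxml dependency (assumption).
soup = BeautifulSoup(text, "html.parser")

# The tag name includes the prefix, so search for the whole name.
exact = soup.find_all("ix:nonfraction")

# Alternatively, match every tag whose name starts with "ix:".
prefixed = soup.find_all(re.compile(r"^ix:"))

# Searching for the bare prefix finds nothing, as in the question.
bare = soup.find_all("ix")

values = [tag.get_text(strip=True) for tag in exact]
print(values)
print(len(prefixed), len(bare))
```

The regular-expression form is handy when the document mixes several "ix:" elements and you want all of them in one pass.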