I am building a scraper that retrieves all of the review titles from an airline review website. I use 5 different URLs because I want to compare the titles across 5 different airlines. However, my code only lists the review titles for the last URL, which is the one for Alaska Airlines. I originally built it with a single list holding all of the URLs, but it had exactly the same bug and only showed the Alaska Airlines results.

My code:
# Insert the following command into the command prompt before starting for faster run time:
# jupyter notebook --NotebookApp.iopub_data_rate_limit=1.0e10
# Importing and installing necessary packages
!pip install lxml
from bs4 import BeautifulSoup
import requests
import pandas as pd
from pprint import pprint
base_url = 'https://www.airlinequality.com/airline-reviews/'
endings = ['american-airlines', 'delta-air-lines', 'united-airlines',
           'southwest-airlines', 'alaska-airlines']

for ending in endings:
    url = base_url + ending
    r = requests.get(url)
    soup = BeautifulSoup(r.content, 'html.parser')
    results = soup.find('div', id='container')

# Retrieving all reviews
titles = results.find_all('h2', class_='text_header')
for title in titles:
    print(title, end="\n"*2)
My output:
<h2 class="text_header">"first class customer service"</h2>
<h2 class="text_header">"deeply unsatisfactory"</h2>
<h2 class="text_header">"Everything was just fabulous"</h2>
<h2 class="text_header">"Messed up airline"</h2>
<h2 class="text_header">"agents who obviously care so much" </h2>
<h2 class="text_header">“communication was sorely lacking”</h2>
<h2 class="text_header">"Never encountered ruder gate workers"</h2>
<h2 class="text_header">"Our check-in bag was badly damaged"</h2>
<h2 class="text_header">"never book again with Alaska Airlines"</h2>
<h2 class="text_header">"I could not get on the plane"</h2>
<h2 class="text_header">The Worlds Best Airlines</h2>
<h2 class="text_header">THE NICEST AIRPORT STAFF</h2>
<h2 class="text_header">THE CLEANEST AIRLINE</h2>
<h2 class="text_header">Alaska Airlines Photos</h2>
This is the output I want, but for all 5 URLs. How can I retrieve the review titles from all of the URLs?