When I try to get the videos from a YouTube channel, every link comes back as None. Here is my code:

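from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

url = "https://www.youtube.com/@ecbeuro/videos"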

service = Service("/usr/bin/chromedriver")

options = Options()
options.add_argument("--headless")
options.add_argument("--no-sandbox")
options.add_argument('--disable-blink-features=AutomationControlled')

driver = webdriver.Chrome(service=service, options=options)
driver.get(url)

news_links = driver.find_elements(By.XPATH, '//*[@id="video-title"]')
for link in news_links:
  print(link.get_attribute('href'))

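Thanks for your help. Kind regards!

Recommended answer

  • If you only need the most recent videos from a YouTube channel, the Python requests library alone is enough to get the 30 latest videos.

  • If you want to scrape every video available on the channel, you have to scroll down repeatedly so that more (and eventually all) of the videos get loaded. Selenium can do that: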

import time
from selenium.webdriver import Chrome, ChromeOptions
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

options = ChromeOptions()
options.add_argument("--start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option("useAutomationExtension", False)

driver = Chrome(options=options)
wait = WebDriverWait(driver, 10)
driver.get("https://www.youtube.com/@ecbeuro/videos")

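last_height = 0
print("Start scrolling!")
while True:
    # Jump to the current bottom of the page so YouTube loads the next batch of videos.
    driver.execute_script("window.scrollTo(0, document.documentElement.scrollHeight);")
    new_height = driver.execute_script("return document.documentElement.scrollHeight")
    # Wait until the video cards currently in the DOM are visible.
    wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, 'div#dismissible')))
    # An unchanged height means the previous scroll loaded nothing new.
    if last_height == new_height:
        print("Stop scrolling, reached the bottom!")
        break
    last_height = new_height

    # Give the next batch a moment to load before scrolling again.
    time.sleep(1)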

data = []
# Each video card on the channel page is a div#dismissible element.
videos = driver.find_elements(By.CSS_SELECTOR, 'div#dismissible')
for video in videos:
    # The watch URL is on the thumbnail link.
    url = video.find_element(By.CSS_SELECTOR, 'div#thumbnail>ytd-thumbnail>a').get_attribute('href')

    details = video.find_element(By.CSS_SELECTOR, 'div#details')
    title = details.find_element(By.CSS_SELECTOR, 'div#meta>h3').text
    # The metadata line holds two items: the view count, then the relative upload date.
    views_date = details.find_elements(By.CSS_SELECTOR, 'span.inline-metadata-item.style-scope.ytd-video-meta-block')
    views = views_date[0].text.strip()
    date = views_date[1].text.strip()

    data.append({"title": title, "url": url, "views": views, "posted_date": date})

print(f"Total videos: {len(data)}")
print(data)

Output:

Start scrolling!
Stop scrolling, reached the bottom!
Total videos: 1480
[{'title': 'President Lagarde presents the latest monetary policy decisions – 27July 2023', 'url': 'https://www.youtube.com/watch?v=eUlRXBy3pBU', 'views': '2.3K views', 'posted_date': '5 days ago'}, {'title': 'Panel Discussion at the CESEE Conference 2023', 'url': 'https://www.youtube.com/watch?v=YMir_50lWhc', 'views': '746 views', 'posted_date': '13 days ago'}, {'title': 'Panel Discussion number 2 and Closing remarks at the CESEE Conference 2023', 'url': 'https://www.youtube.com/watch?v=bt-_Scd3864', 'views': '432 views', 'posted_date': '13 days ago'}, {'title': 'Civil Society Seminar Series: The evolution of European banking supervision', 'url': 'https://www.youtube.com/watch?v=d6MfWiHfua8', 'views': '412 views', 'posted_date': '2 weeks ago'}, {'title': 'Keynote speech of Valdis Dombrovskis at the CESEE Conference 2023', 'url': 'https://www.youtube.com/watch?v=-GnWOVKxFEk', 'views': '209 views', 'posted_date': '2 weeks ago'}, {'title': "Christine Lagarde's opening remarks for the CESEE Conference 2023", 'url': 'https://www.youtube.com/watch?v=pQzcvSXlI0M', 'views': '1K views', 'posted_date': '2 weeks ago'}, {'title': 'Keynote speech of Beata Javorcik at the CESEE Conference 2023', 'url': 'https://www.youtube.com/watch?v=rZITdIZYxBQ', 'views': '397 views', 'posted_date': '2 weeks ago'}, {'title': 'Civil Society Seminar Series: A digital euro for everyone', 'url': 'https://www.youtube.com/watch?v=gXD_7BDIn8Q', 'views': '1.8K views', 'posted_date': '2 weeks ago'}, {'title': 'New Euro banknotes re-design survey', 'url': 'https://www.youtube.com/watch?v=-ynWm1sYA9Q', 'views': '39K views', 'posted_date': '3 weeks ago'}, ....... 
{'title': 'ECB Press Conference - 13 January 2011 - Part 1/2', 'url': 'https://www.youtube.com/watch?v=fl_zhb4lW6c', 'views': '434 views', 'posted_date': '12 years ago'}, {'title': 'Preisstabilität: Warum ist sie für dich wichtig?', 'url': 'https://www.youtube.com/watch?v=6bSdXmxFcEE', 'views': '76K views', 'posted_date': '12 years ago'}, {'title': 'A estabilidade de preços é importante porquê?', 'url': 'https://www.youtube.com/watch?v=v4Zmx5OsKM8', 'views': '16K views', 'posted_date': '12 years ago'}, {'title': 'La stabilité des prix : pourquoi est-elle importante pour vous ?', 'url': 'https://www.youtube.com/watch?v=0xqcKYG9ax4', 'views': '37K views', 'posted_date': '12 years ago'}, {'title': 'Price stability: why is it important for you ?', 'url': 'https://www.youtube.com/watch?v=F6PvX625JCs', 'views': '66K views', 'posted_date': '12 years ago'}, {'title': 'Hinnastabiilsus – miks see on oluline?', 'url': 'https://www.youtube.com/watch?v=LhdGJ_g8k2M', 'views': '2.5K views', 'posted_date': '12 years ago'}, {'title': 'ECB Press Conference - 2 December 2010 - Part 1/2', 'url': 'https://www.youtube.com/watch?v=KsHgS6VslIk', 'views': '263 views', 'posted_date': '12 years ago'}, {'title': 'ECB Press Conference - 2 December 2010 - Part 2/2', 'url': 'https://www.youtube.com/watch?v=SP8PCanl93o', 'views': '221 views', 'posted_date': '12 years ago'}, {'title': 'ECB Statistics', 'url': 'https://www.youtube.com/watch?v=FyHiyPYyDp0', 'views': '3.3K views', 'posted_date': '12 years ago'}, {'title': 'The ECB launches new educational games', 'url': 'https://www.youtube.com/watch?v=HMIsUkNWKnE', 'views': '1.7K views', 'posted_date': '12 years ago'}, {'title': 'ECB - Inflation Island and Economia: Educational Games', 'url': 'https://www.youtube.com/watch?v=hcQUJSz82oQ', 'views': '9.5K views', 'posted_date': '12 years ago'}]
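For the first option above (latest videos only), here is a minimal sketch of one common way to do it with requests. It is not necessarily the approach the answer had in mind. It assumes the channel page embeds its initial data as a JSON object assigned to `ytInitialData`, and that each video appears in that object under a `videoRenderer` key with `videoId` and `title.runs[0].text` fields. These are observations about YouTube's markup, not a documented API, so they may change at any time. The function names are my own.

```python
import json
import re


def extract_initial_data(html):
    """Return the JSON object assigned to ytInitialData in the page, or None."""
    # Assumption: the page contains `ytInitialData = {...}` inside a <script> tag.
    match = re.search(r"ytInitialData\s*=\s*(?=\{)", html)
    if match is None:
        return None
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever follows it (the rest of the script).
    data, _ = json.JSONDecoder().raw_decode(html, match.end())
    return data


def iter_video_renderers(node):
    """Yield every dict stored under a 'videoRenderer' key, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "videoRenderer" and isinstance(value, dict):
                yield value
            else:
                yield from iter_video_renderers(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_video_renderers(item)


def latest_videos(html):
    """Return title/url dicts for the videos embedded in the page HTML."""
    data = extract_initial_data(html)
    if data is None:
        return []
    videos = []
    for renderer in iter_video_renderers(data):
        video_id = renderer.get("videoId")
        if not video_id:
            continue
        runs = renderer.get("title", {}).get("runs", [])
        title = runs[0].get("text", "") if runs else ""
        videos.append({"title": title, "url": f"https://www.youtube.com/watch?v={video_id}"})
    return videos


def fetch_channel_html(url):
    """Download the channel page; needs the third-party requests package."""
    import requests  # imported here so the parsing helpers work without it

    response = requests.get(url, headers={"Accept-Language": "en-US"}, timeout=10)
    response.raise_for_status()
    return response.text
```

Usage would be `latest_videos(fetch_channel_html("https://www.youtube.com/@ecbeuro/videos"))`. Only the first batch of videos is embedded in the initial HTML, which is why this route cannot return the whole channel. In some regions YouTube may serve a consent page instead, in which case the function simply returns an empty list.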

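One caveat about the scroll loop in the Selenium script: it stops the first time the page height is unchanged, so a slow network can end it early. A possible variation, sketched below, stops only after the height has stayed the same for several checks in a row. The function name and the `pause` and `max_stalls` parameters are my own, not part of Selenium; `driver` is any object with an `execute_script` method, such as a Selenium WebDriver.

```python
import time


def scroll_to_bottom(driver, pause=1.0, max_stalls=3):
    """Scroll until the page height is unchanged for max_stalls checks in a row.

    Returns the number of scroll rounds performed.
    """
    last_height = 0
    stalls = 0
    rounds = 0
    while stalls < max_stalls:
        # Jump to the current bottom, then give new content time to load.
        driver.execute_script("window.scrollTo(0, document.documentElement.scrollHeight);")
        time.sleep(pause)
        new_height = driver.execute_script("return document.documentElement.scrollHeight")
        if new_height == last_height:
            # Nothing new appeared; count it, but keep trying a few more times.
            stalls += 1
        else:
            stalls = 0
            last_height = new_height
        rounds += 1
    return rounds
```

In the script it would replace the `while True:` loop with a single call, `scroll_to_bottom(driver)`. The trade-off is a few extra seconds at the end of the run in exchange for fewer premature stops.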