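I'm just getting started with web scraping. My attempt is very simple: I download and install Firefox, then go to Google. To confirm that the code works, I try to take a screenshot and save it to a temporary path so I can display it at the end.
Unfortunately, when I run the code there are no apparent errors, but when it finishes I only get a completely blank image, as shown below:
My code is as follows:
Selenium installation cmd: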
!pip install selenium
!pip install webdriver-manager
Firefox download and installation into a temporary path cmd:
%sh
wget https://ftp.mozilla.org/pub/firefox/releases/106.0.1/linux-x86_64/en-US/firefox-106.0.1.tar.bz2 -O /tmp/firefox.tar.bz2
%sh
tar -xvf /tmp/firefox.tar.bz2 -C /tmp/
%sh
sudo apt-get update -y
%sh
sudo apt-get install -y wget bzip2 libxtst6 libgtk-3-0 libx11-xcb-dev libdbus-glib-1-2 libxt6 libpci-dev libudev-dev
%sh
rm -rf /var/lib/apt/lists/*
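Before running the Selenium cell, it may be worth confirming that the tarball actually unpacked to where the script expects. This is a minimal sketch, assuming the binary ends up at /tmp/firefox/firefox (the path later assigned to options.binary_location). check_binary is a hypothetical helper written for illustration, not part of any library:

```python
import os


def check_binary(path):
    """Return a short diagnostic string about the file at `path`."""
    if not os.path.isfile(path):
        return "missing"
    if not os.access(path, os.X_OK):
        return "not executable"
    return "ok"


# Assumed location, taken from the tar extraction target used above.
print(check_binary("/tmp/firefox/firefox"))
```

Anything other than "ok" would point to a problem with the download or extraction step rather than with Selenium itself.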
Algorithm cmd:
from selenium import webdriver
from selenium.webdriver.firefox.service import Service
from selenium.webdriver.firefox.options import Options
from webdriver_manager.firefox import GeckoDriverManager
import time
from IPython.display import Image
# Temporary path
screenshot_path = '/tmp/web.png'
# Firefox controller configuration
service = Service(executable_path=GeckoDriverManager().install())
options = Options()
options.set_preference("browser.download.folderList", 2)
options.set_preference("browser.download.manager.showWhenStarting", False)
options.set_preference("browser.download.dir", '/tmp/head_count_data/')
options.headless = False
options.add_argument('--headless')
options.binary_location = '/tmp/firefox/firefox'
driver = webdriver.Firefox(options=options, service=service)
login_url = "https://www.google.com/"
time.sleep(20)
# Screenshot
driver.save_screenshot(screenshot_path)
time.sleep(20)
# Show results
Image(filename=screenshot_path)
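I have tried this many times: I restarted the kernel, changed the image format to .jpg, and checked that the PIL library is installed (the result reports version 9.2).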
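One thing that stands out in the listing is that login_url is defined but never passed to driver.get(), so the browser may still be on its initial blank page when the screenshot is taken. The sketch below shows the navigate-then-capture order under that assumption. capture_page and build_driver are hypothetical helper names introduced here, and the 5-second wait is an arbitrary stand-in for the sleeps in the original code:

```python
import time


def capture_page(driver, url, path, settle_seconds=5):
    """Load `url` in `driver`, wait briefly, then save a screenshot to `path`.

    Returns (current_url, saved) so the caller can confirm a page was loaded.
    """
    driver.get(url)             # navigate first; a new session typically starts on a blank page
    time.sleep(settle_seconds)  # crude wait, mirroring the sleeps in the original listing
    saved = driver.save_screenshot(path)  # True on success, False if the file could not be written
    return driver.current_url, saved


def build_driver(binary_path="/tmp/firefox/firefox"):
    """Create a headless Firefox driver with the same setup as the listing above."""
    # Imported lazily so capture_page can be exercised without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.firefox.service import Service
    from webdriver_manager.firefox import GeckoDriverManager

    options = Options()
    options.add_argument("--headless")
    options.binary_location = binary_path
    service = Service(executable_path=GeckoDriverManager().install())
    return webdriver.Firefox(options=options, service=service)


# Possible usage in the notebook (not executed here):
#   driver = build_driver()
#   try:
#       print(capture_page(driver, "https://www.google.com/", "/tmp/web.png"))
#   finally:
#       driver.quit()
```

If the returned URL is still about:blank, the navigation did not happen, which would be consistent with an empty screenshot.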