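I am trying to scrape the data for all Amazon best sellers using Scrapy. I can get the whole selector list for the data, but when I iterate over that list, the result still contains only a single data item.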
def parse_page(self, response):
    product_data = response.xpath("//div[@id='gridItemRoot']")  # THIS RETURNS A SELECTOR LIST
    for data in product_data:
        product_name = data.xpath("//div[@class='a-section a-spacing-mini _cDEzb_noop_3Xbw5']//img/@alt").get()
        product_rank = data.xpath("//span[@class='zg-bdg-text']/text()").get()
        # It only generates a single result
        yield {
            "name": product_name,
            "rank": product_rank
        }
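I also tried not iterating over the selector list and instead calling the selectors directly on the response and yielding the result, but that returned a single element as well.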
def parse_page(self, response):
    # In previous applications, all the results were scraped without
    # iterating over any SelectorList, just like the following
    product_name = response.xpath("//div[@class='a-section a-spacing-mini _cDEzb_noop_3Xbw5']//img/@alt").get()
    product_rank = response.xpath("//span[@class='zg-bdg-text']/text()").get()
    yield {
        "name": product_name,
        "rank": product_rank
    }