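I'm trying to scrape data (legally) from a website using Selenium. I tried the code below, but it doesn't show any result. I'll first state what I want to get, then show what the HTML looks like, and my code is at the bottom.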

I just need to get "31 954 kr" from the HTML.

Full HTML:

<article class="c-product-tile h-full relative product-}">
            <a class="c-product-tile__link" tabindex="0" title="Balance  Spisebord" ng-click="$ctrl.handleClick($event)" pdp-link="c/03-102-01_10991348" outlet="$ctrl.product.type === 'outletProduct'" target="_self" href="/nb-no/mot-oss/butikker/online-outlet/produkt/c/03-102-01_10991348/">
        </a>
    
    
        <div class="flex h-full flex-col relative">
            <div class="c-product-tile__image bg-wild-sand px-6 py-28 mb-4 relative md:px-14 md:py-32 lg:px-12 lg:py-60">
                    <!----><div class="c-aspect-ratio-16x9 anim-fade-in  scale-" ng-if="::$ctrl.product.type !== 'legs' || !$ctrl.product.useLegImages" style="">
            <tile-image base-src="/products/03-102-01_10991348.webp" template="tileImageProductTile" is-product-image="true" move-pressmode-btn=".product- .c-product-tile__image" cachebust="0" mode="pad" aspect-ratio="16x9" default-on-error-url="/layout/images/produkt-default-medium-1.png" bgcolor="transparent" description="Balance  Spisebord"><!----><div class="c-tile-image" ng-if="$ctrl.readyForLoad" style="">
        <!---->
    
        <!---->
        <img src="/layout/images/7x3.gif" lazy-src="https://images.bolia.com/cdn-cgi/image/background=transparent,fit=pad,width=500,format=auto,height=281,quality=81/products/03-102-01_10991348.webp?v=0" lazy-srcset="https://images.bolia.com/cdn-cgi/image/background=transparent,fit=pad,width=340,format=auto,height=191,quality=81/products/03-102-01_10991348.webp?v=0 340w,https://images.bolia.com/cdn-cgi/image/background=transparent,fit=pad,width=540,format=auto,height=303,quality=81/products/03-102-01_10991348.webp?v=0 540w" load-immediately="$ctrl.loadImmediately" sizes="153px" alt="Balance  Spisebord" on-error="/layout/images/produkt-default-medium-1.png" ng-class="{'tile-image__takeover' : $ctrl.takeOver, '': $ctrl.cssClass !== undefined}" srcset="https://images.bolia.com/cdn-cgi/image/background=transparent,fit=pad,width=340,format=auto,height=191,quality=81/products/03-102-01_10991348.webp?v=0 340w,https://images.bolia.com/cdn-cgi/image/background=transparent,fit=pad,width=540,format=auto,height=303,quality=81/products/03-102-01_10991348.webp?v=0 540w">
    </div><!---->
    </tile-image>
        </div><!---->
    
                    <div class="absolute pin my-6">
            <div class="absolute pin-t pin-l">
                <!---->
                <!---->
            </div>
            <!---->
            <div class="absolute pin-l pin-b pin-r flex justify-between">
                <!---->
                <div class="pl-2 pr-4 w-full flex flex-wrap-reverse flex-col justify-self-end justify-end">
                    <!---->
                    <!----><!----><!---->
                </div>
            </div>
        </div>
    
            </div>
                <!----><div ng-if="!$ctrl.noPrice" class="" style="">
            <!---->
    
            <!----><div ng-if="::$ctrl.product.type === 'outletProduct'" class="mb-4">
                <p class="c-product-tile__title c-text-caption m-0 font-bold">
                    <span class="c-text-caption" ng-bind="$ctrl.product.title">Balance  Spisebord</span>
                </p>
                <p class="c-text-caption truncate m-0 hidden md:block" ng-bind="::$ctrl.product.designInformation">Brun marmor</p>
                <p class="c-text-caption m-0" ng-bind="::$ctrl.product.details">God stand</p>
            </div><!---->
    
            <!----><div ng-if="::$ctrl.showFromPrice() || $ctrl.showSalesPrice()" class="m-0 flex flex-wrap items-baseline">
                <!----><p ng-if="!$ctrl.isMyBoliaBoostDiscount()" class="flex flex-wrap c-text-caption m-0">
                    <!----><span ng-if="$ctrl.showSalesPrice()" class="flex flex-wrap items-baseline">
                        <!----><span ng-if="$ctrl.showListPrice()" class="mr-3 font-bold" ng-class="{'font-bold':  $ctrl.showListPrice()}" ng-bind="::$ctrl.product.salesPrice.amount">31&nbsp;954 kr.</span><!---->
                        <span ng-class="{'line-through': $ctrl.showListPrice()}" class="mr-3 line-through" ng-bind="::$ctrl.product.listPrice.amount">63&nbsp;909 kr.</span>
                        <!----><span ng-if="$ctrl.showDiscountText()" ng-style="::$ctrl.splashLabelService.labelStyle()" class="c-text-caption mr-3 flex items-center my-0 px-2 py-1 bg-brandy" style="background: rgb(225, 221, 212); color: rgb(0, 0, 0);">
                            Spar 50%
                        </span><!---->
                    </span><!---->
                    <!---->
                </p><!---->
                <!---->
                <!---->
            </div><!---->
            <!---->
            <!---->
            <!---->
        </div><!---->
        <!---->
    
        </div>
    </article>

My Python code (Selenium):

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


driver = webdriver.Chrome()


driver.get("wwww.scrapedatafromhere.com")


element = WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located((By.XPATH, "(//article[contains(@class, 'c-product-tile')]//p[contains(@class, 'c-product-tile__title')]/span)[1]"))
)

text = element.text
print("Text:", text)

The result is empty (no errors). What am I doing wrong here?

Recommended answer

A simpler approach is to use the site's AJAX API and read the name, price, and so on from there:

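import requests

# Page the data belongs to (kept for reference):
# url = "https://www.bolia.com/sv-se/soffor/hornsoffor/?family=scandinavia remix&lastfacet=family"
# JSON endpoint that serves the product list for that page.
api_url = "https://www.bolia.com/api/search"

# Query parameters for the search request.
params = {
    "family": "scandinavia remix",
    "includerangelimits": "true",
    "language": "sv-se",
    "lastfacet": "family",
    "mode": "category",
    "pageLink": "5768",
    "productGroupId": "30136",
    "v": "2024.7620.0207.1-23",
}

# Fail early on HTTP errors instead of trying to parse an error page as JSON.
response = requests.get(api_url, params=params, timeout=10)
response.raise_for_status()
data = response.json()
# print(data)

# Each product carries its title and a preformatted sales price string.
for p in data["products"]["results"][0]["results"]:
    print(f'{p["title"]:<60} {p["salesPrice"]["amount"]}')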

Prints:

Scandinavia Remix 5 pers. Hörnsoffa (2 Hörn 2)               60 039 kr.
Scandinavia Remix 5 pers. Hörnsoffa (2½ hörn 2)              61 179 kr.
Scandinavia Remix 5 pers. Hörnsoffa (2 hörn 2½)              61 179 kr.
Scandinavia Remix 6 pers. Hörnsoffa (2½ hörn 2½)             62 509 kr.
Scandinavia Remix 6 pers. Hörnsoffa (3 hörn 2)               65 069 kr.
Scandinavia Remix 6 pers. Hörnsoffa (2 hörn 3)               65 069 kr.
Scandinavia Remix 5 pers. hörnsoffa med Schäslong vänstre - Open end höger 77 519 kr.
Scandinavia Remix 5 pers. hörnsoffa med Schäslong höger - Open end vänster 77 519 kr.
Scandinavia Remix 4 pers. hörnsoffa - Open end vänster       59 279 kr.
Scandinavia Remix 4 pers. hörnsoffa - Open end höger         59 279 kr.
Scandinavia Remix 5 pers. hörnsoffa med open end - vänster   60 609 kr.
Scandinavia Remix 5 pers. hörnsoffa med Open end - höger     60 609 kr.
Scandinavia Remix 6 pers. hörnsoffa - Open end vänster       64 309 kr.
Scandinavia Remix 6 pers. hörnsoffa - Open end höger         64 309 kr.
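
If you would rather stay with the rendered HTML: the XPath in the question points at the title span (`c-product-tile__title`), while the price sits in the span whose `ng-bind` is `::$ctrl.product.salesPrice.amount`. The sketch below is only an illustration under stated assumptions. It parses a fragment copied from the question with the standard-library `html.parser` instead of a live browser, and `BoundTextParser`, `normalize_price` and `price_to_int` are hypothetical helper names, not part of Selenium or of the site. It also shows that the price contains a non-breaking space (`&nbsp;`), which you probably want to normalize.

```python
from html.parser import HTMLParser

# Fragment copied from the question's HTML (only the two price spans).
FRAGMENT = """
<span ng-if="$ctrl.showListPrice()" class="mr-3 font-bold" ng-bind="::$ctrl.product.salesPrice.amount">31&nbsp;954 kr.</span>
<span class="mr-3 line-through" ng-bind="::$ctrl.product.listPrice.amount">63&nbsp;909 kr.</span>
"""

SALES_BINDING = "::$ctrl.product.salesPrice.amount"


class BoundTextParser(HTMLParser):
    """Collect the text of every <span> whose ng-bind equals `binding`."""

    def __init__(self, binding):
        # convert_charrefs=True turns &nbsp; into the character U+00A0.
        super().__init__(convert_charrefs=True)
        self.binding = binding
        self.texts = []
        self._capturing = False

    def handle_starttag(self, tag, attrs):
        if tag == "span" and dict(attrs).get("ng-bind") == self.binding:
            self._capturing = True
            self.texts.append("")

    def handle_endtag(self, tag):
        if tag == "span":
            self._capturing = False

    def handle_data(self, data):
        if self._capturing:
            self.texts[-1] += data


def normalize_price(text):
    """Replace non-breaking spaces (U+00A0) with plain spaces and trim."""
    return text.replace("\xa0", " ").strip()


def price_to_int(text):
    """Keep only the digits of a display price and return them as an int."""
    return int("".join(ch for ch in text if ch.isdigit()))


parser = BoundTextParser(SALES_BINDING)
parser.feed(FRAGMENT)
parser.close()

sales_price = normalize_price(parser.texts[0])
print(sales_price)                # 31 954 kr.
print(price_to_int(sales_price))  # 31954
```

With Selenium, the same attribute could be targeted with something like `By.CSS_SELECTOR` and `span[ng-bind*='salesPrice.amount']`. Because `.text` only returns text that is actually rendered as visible, reading `element.get_attribute("textContent")` may be worth trying when `.text` comes back empty.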
