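I am currently working on a data extraction task in Python. The data lives in a Power BI dashboard, which has proven very difficult to access. I came across a solution on Stack Overflow, but it does not work. I am not sure what navigation steps are needed to reach the component on my page and extract the table. Here is the code I have implemented so far: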

import requests
from bs4 import BeautifulSoup
import pandas as pd

# Page URL
url = "https://app.powerbi.com/view?r=eyJrIjoiNTUyNTk4YmUtN2U4Ny00N2MzLWFlM2UtYTY1ZGNhYTA5N2NmIiwidCI6IjFlZjViNjViLTkxYjktNGVjMS1iNmU0LTc3YTA1MzcxNTk1MyJ9"

# Send a GET request to fetch the page content
response = requests.get(url)
content = response.content

# Parse the HTML with BeautifulSoup
soup = BeautifulSoup(content, "html.parser")

# Find the element that holds the iframe code
iframe_element = soup.find("div", {"data-element-id": "elm_8HNLtJp5glVRnSc1LAaKwg"})

# Extract the iframe link
iframe_src = iframe_element.find("iframe")["src"]

# Send a GET request to fetch the iframe content
iframe_response = requests.get(iframe_src)
iframe_content = iframe_response.content

# Read the data from the iframe with pandas
df = pd.read_html(iframe_content)[0]

# Display the DataFrame
print(df)

I need to be able to extract all of the data, and I would like to download it as a CSV or XLSX file.

Thank you in advance.

Recommended answer

Perhaps you can emulate the AJAX request the page makes to load its data. Here is an example:

import pandas as pd
import requests


RT = []
api_url = "https://wabi-brazil-south-api.analysis.windows.net/public/reports/querydata?synchronous=true"


def get_query(RT):
    query = {
        "cancelQueries": [],
        "modelId": 4824238,
        "queries": [
            {
                "Query": {
                    "Commands": [
                        {
                            "SemanticQueryDataShapeCommand": {
                                "Binding": {
                                    "DataReduction": {
                                        "DataVolume": 3,
                                        "Primary": {
                                            "Window": {
                                                "Count": 500,
                                                "RestartTokens": RT,
                                            }
                                        },
                                    },
                                    "Primary": {
                                        "Groupings": [
                                            {"Projections": [0, 1, 2], "Subtotal": 1}
                                        ]
                                    },
                                    "Version": 1,
                                },
                                "ExecutionMetricsKind": 1,
                                "Query": {
                                    "From": [
                                        {
                                            "Entity": "RESULTADOS - DENATRAN 2 0",
                                            "Name": "r",
                                            "Type": 0,
                                        },
                                        {
                                            "Entity": "Tabela Mestre VE",
                                            "Name": "t",
                                            "Type": 0,
                                        },
                                    ],
                                    "OrderBy": [
                                        {
                                            "Direction": 1,
                                            "Expression": {
                                                "Column": {
                                                    "Expression": {
                                                        "SourceRef": {"Source": "r"}
                                                    },
                                                    "Property": "CIDADE",
                                                }
                                            },
                                        }
                                    ],
                                    "Select": [
                                        {
                                            "Column": {
                                                "Expression": {
                                                    "SourceRef": {"Source": "r"}
                                                },
                                                "Property": "CIDADE",
                                            },
                                            "Name": "RESULTADOS - DENATRAN 2 0.CIDADE",
                                        },
                                        {
                                            "Aggregation": {
                                                "Expression": {
                                                    "Column": {
                                                        "Expression": {
                                                            "SourceRef": {"Source": "r"}
                                                        },
                                                        "Property": "QTD",
                                                    }
                                                },
                                                "Function": 0,
                                            },
                                            "Name": "Sum(RESULTADOS - DENATRAN 2 0.QTD)",
                                        },
                                        {
                                            "Arithmetic": {
                                                "Left": {
                                                    "Aggregation": {
                                                        "Expression": {
                                                            "Column": {
                                                                "Expression": {
                                                                    "SourceRef": {
                                                                        "Source": "r"
                                                                    }
                                                                },
                                                                "Property": "QTD",
                                                            }
                                                        },
                                                        "Function": 0,
                                                    }
                                                },
                                                "Operator": 3,
                                                "Right": {
                                                    "ScopedEval": {
                                                        "Expression": {
                                                            "Aggregation": {
                                                                "Expression": {
                                                                    "Column": {
                                                                        "Expression": {
                                                                            "SourceRef": {
                                                                                "Source": "r"
                                                                            }
                                                                        },
                                                                        "Property": "QTD",
                                                                    }
                                                                },
                                                                "Function": 0,
                                                            }
                                                        },
                                                        "Scope": [],
                                                    }
                                                },
                                            },
                                            "Name": "Divide(Sum(RESULTADOS - DENATRAN 2 0.QTD), ScopedEval(Sum(RESULTADOS - DENATRAN 2 0.QTD), []))",
                                        },
                                    ],
                                    "Version": 2,
                                    "Where": [
                                        {
                                            "Condition": {
                                                "In": {
                                                    "Expressions": [
                                                        {
                                                            "Column": {
                                                                "Expression": {
                                                                    "SourceRef": {
                                                                        "Source": "r"
                                                                    }
                                                                },
                                                                "Property": "DT FROTA",
                                                            }
                                                        }
                                                    ],
                                                    "Values": [
                                                        [
                                                            {
                                                                "Literal": {
                                                                    "Value": "datetime'2023-07-01T00:00:00'"
                                                                }
                                                            }
                                                        ]
                                                    ],
                                                }
                                            }
                                        },
                                        {
                                            "Condition": {
                                                "In": {
                                                    "Expressions": [
                                                        {
                                                            "Column": {
                                                                "Expression": {
                                                                    "SourceRef": {
                                                                        "Source": "t"
                                                                    }
                                                                },
                                                                "Property": "TIPO VEICULO",
                                                            }
                                                        }
                                                    ],
                                                    "Values": [
                                                        [
                                                            {
                                                                "Literal": {
                                                                    "Value": "'CARROS'"
                                                                }
                                                            }
                                                        ]
                                                    ],
                                                }
                                            }
                                        },
                                    ],
                                },
                            }
                        }
                    ]
                },
                "QueryId": "",
            }
        ],
        "version": "1.0.0",
    }
    return query


headers = {"X-PowerBI-ResourceKey": "552598be-7e87-47c3-ae3e-a65dcaa097cf"}


def find_key_recursively(obj, key):
    if isinstance(obj, dict):
        if key in obj:
            yield obj[key]

        for v in obj.values():
            yield from find_key_recursively(v, key)
    elif isinstance(obj, list):
        for v in obj:
            yield from find_key_recursively(v, key)


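all_data = []
while True:
    # POST the query; RT holds the restart tokens (empty on the first request)
    data = requests.post(api_url, json=get_query(RT), headers=headers).json()

    # "DM1" holds the rows of this page; each row's "C" list carries the cell values
    x = next(find_key_recursively(data, "DM1"))
    for d in x:
        if len(d["C"]) == 1:
            # Only one value was returned for this row; the count is recorded as 1
            all_data.append((d["C"][0], 1))
        else:
            all_data.append((d["C"][0], d["C"][1]))

    # A response without an "RT" key means there are no more pages to fetch
    try:
        RT = next(find_key_recursively(data, "RT"))
    except StopIteration:
        break

    print(RT)

df = pd.DataFrame(all_data, columns=["Name", "Count"])
print(df)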

Prints:


...

3112                     VOLTA REDONDA    217
3113                        VOTORANTIM     78
3114                       VOTUPORANGA     51
3115                            WAGNER      1
3116                    WENCESLAU BRAZ      2
3117               WENCESLAU GUIMARAES      1
3118                         WESTFALIA      4
3119                         WITMARSUM      1
3120                         XANGRI-LA     46
3121                           XANXERE     34
3122                            XAPURI      1
3123                             XAXIM     18
3124                          XINGUARA      3
3125                       XIQUE-XIQUE      6
3126                            ZABELE      1
3127                          ZACARIAS      2
3128                           ZE DOCA      4
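
The question asks for a CSV or XLSX file, which the code above stops short of. A minimal sketch of that last step follows. `export_frame` and the file name are hypothetical, and the small `sample` frame only stands in for the `df` built above. Writing XLSX assumes an Excel writer engine such as openpyxl is installed.

```python
import pandas as pd


def export_frame(df, csv_path, xlsx_path=None):
    """Write df to CSV and, optionally, to XLSX (hypothetical helper)."""
    # utf-8-sig adds a byte order mark, which helps Excel detect the encoding
    df.to_csv(csv_path, index=False, encoding="utf-8-sig")
    if xlsx_path is not None:
        # Assumption: an Excel writer engine such as openpyxl is available
        df.to_excel(xlsx_path, index=False)


if __name__ == "__main__":
    # Stand-in for the real df, using two rows from the printed output
    sample = pd.DataFrame(
        [("VOLTA REDONDA", 217), ("VOTORANTIM", 78)],
        columns=["Name", "Count"],
    )
    export_frame(sample, "powerbi_export.csv")
```

With the real data, calling `export_frame(df, "powerbi_export.csv", "powerbi_export.xlsx")` after the loop would write both files.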

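The `X-PowerBI-ResourceKey` value hard-coded in the answer does not have to be copied by hand. In this particular URL, the `r` parameter is base64-encoded JSON whose `k` field holds that key. The sketch below relies on that observation, which comes from decoding this one URL and is not documented behavior. `extract_resource_key` is a hypothetical helper name.

```python
import base64
import json
from urllib.parse import parse_qs, urlparse


def extract_resource_key(report_url):
    """Return the "k" field of the JSON encoded in the URL's r parameter."""
    token = parse_qs(urlparse(report_url).query)["r"][0]
    # Restore any base64 padding that was stripped from the URL
    token += "=" * (-len(token) % 4)
    payload = json.loads(base64.b64decode(token))
    return payload["k"]


url = "https://app.powerbi.com/view?r=eyJrIjoiNTUyNTk4YmUtN2U4Ny00N2MzLWFlM2UtYTY1ZGNhYTA5N2NmIiwidCI6IjFlZjViNjViLTkxYjktNGVjMS1iNmU0LTc3YTA1MzcxNTk1MyJ9"
headers = {"X-PowerBI-ResourceKey": extract_resource_key(url)}
```

For the URL in the question, this yields the same key the answer passes in `headers`. The `modelId`, entity names, and API host still have to be read from the request the page sends, for example in the browser's network tab.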