I have a Python script that makes HTTP requests asynchronously for a given list of URLs. Here is the sample code.

#!/usr/local/bin/python3
import httpx
import asyncio

async def httpx_get(client, url):
    # httpx >= 0.20 uses follow_redirects (older releases called it allow_redirects)
    resp1 = await client.get(url, timeout=5, follow_redirects=True)
    # ... other computation ...
    resp2 = await client.get(url, timeout=5, follow_redirects=True) # what happens during this second await?

async def fetch_pages(urls):
    async with httpx.AsyncClient(verify=False) as client:
        # create task for all URLs
        coros = []
        for url in urls:
            coros.append(asyncio.create_task(httpx_get(client, url)))

        # await all tasks
        for coro in asyncio.as_completed(coros):
            resp = await coro 

def main():
    urls = [ "https://google.com", "https://yahoo.com", "https://youtube.com" ] 
    asyncio.run(fetch_pages(urls))

if __name__ == "__main__":
    main()

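I realized that I don't understand when asyncio handles the second await in my httpx_get function, resp2 = await client.get(...).

Since for coro in asyncio.as_completed(coros): resp = await coro only resolves the first await, resp1 = await client.get, for each coroutine, when does the second await, resp2 = await client.get, get resolved?

Are the second awaits/requests also executed and awaited concurrently, or does something else happen? Any clarification is appreciated, thanks!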

Recommended answer

...when asyncio handles the second await in my httpx_get function, resp2 = await client.get(...).

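At the first await, the event loop suspends the task and goes on to do other things; once client.get has returned, the task resumes and runs until it reaches the second await, resp2 = await .... The event loop handles that second await exactly like the first: it suspends the task and does other things until the awaited call has returned. Note that resp = await coro in the as_completed loop waits for the whole httpx_get task to finish, both awaits included, not just the first one.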
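
The suspend-and-resume behaviour described above can be checked with a small sketch that is not part of the original answer. It assumes asyncio.Event objects as stand-ins for the two HTTP responses, so the order of events is controlled by hand; the names two_step, demo, first and second are hypothetical.

```python
import asyncio

async def two_step(first, second, events):
    # Stand-in for httpx_get: two awaits with ordinary code in between.
    events.append("first await starts")
    await first.wait()      # task is suspended until `first` is set
    events.append("between awaits")
    await second.wait()     # suspended again, handled just like the first await
    events.append("second await done")
    return "finished"

async def demo():
    events = []
    first, second = asyncio.Event(), asyncio.Event()
    task = asyncio.create_task(two_step(first, second, events))

    await asyncio.sleep(0)  # yield to the loop: the task runs up to its first await
    first.set()             # "first response arrived"
    await asyncio.sleep(0)  # yield again: the task runs up to its second await

    done_midway = task.done()       # still pending: the second await is unresolved
    seen_midway = list(events)

    second.set()            # "second response arrived"
    result = await task     # awaiting the task waits for the whole coroutine
    return done_midway, seen_midway, result, events

if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The task only counts as done after the second await has resolved, which is why awaiting it in the as_completed loop covers both requests.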


Here is an example that replaces the URL fetches with time delays. Hopefully it is analogous enough to show how your process might play out.

import asyncio
import random
import time

async def httpx_get(client, url):
    # `client` is just the task number here; asyncio.sleep stands in for client.get
    print(f'{time.time()} task:{client} resp1 will be called and awaited')
    resp1 = await asyncio.sleep(random.randrange(1,10))
    # ... other computation ...
    print(f'{time.time()} task:{client} resp1 returned')
    print(f'{time.time()} task:{client} executing')
    print(f'{time.time()} task:{client} executing')
    print(f'{time.time()} task:{client} executing')
    print(f'{time.time()} task:{client} executing')
    print(f'{time.time()} task:{client} resp2 will be called and awaited')
    resp2 = await asyncio.sleep(random.randrange(1,10))
    print(f'{time.time()}  task:{client} resp2 returned')
    return f'{time.time()} task:{client} finished - {url}'

async def fetch_pages(urls):
    # create task for all URLs
    coros = []
    for n,url in enumerate(urls):
        coros.append(asyncio.create_task(httpx_get(n, url)))

    # await all tasks
    for coro in asyncio.as_completed(coros):
        resp = await coro 
        print(f'{time.time()} {resp}')

def main():
    urls = [ "https://google.com", "https://yahoo.com", "https://youtube.com" ] 
    asyncio.run(fetch_pages(urls))

if __name__ == "__main__":
    main()

Output of one run:

1659744155.1480112 task:0 resp1 will be called and awaited
1659744155.1480112 task:1 resp1 will be called and awaited
1659744155.1480112 task:2 resp1 will be called and awaited
1659744161.152556 task:2 resp1 returned
1659744161.152556 task:2 executing
1659744161.152556 task:2 executing
1659744161.152556 task:2 executing
1659744161.154198 task:2 executing
1659744161.154198 task:2 resp2 will be called and awaited
1659744163.1660972 task:1 resp1 returned
1659744163.1660972 task:1 executing
1659744163.1660972 task:1 executing
1659744163.1660972 task:1 executing
1659744163.1660972 task:1 executing
1659744163.1675234 task:1 resp2 will be called and awaited
1659744164.1542242 task:0 resp1 returned
1659744164.1542242 task:0 executing
1659744164.1542242 task:0 executing
1659744164.1542242 task:0 executing
1659744164.155838 task:0 executing
1659744164.155838 task:0 resp2 will be called and awaited
1659744166.167943  task:2 resp2 returned
1659744166.167943 1659744166.167943 task:2 finished - https://youtube.com
1659744168.1679864  task:1 resp2 returned
1659744168.1685212 1659744168.1685212 task:1 finished - https://yahoo.com
1659744169.1565957  task:0 resp2 returned
1659744169.1565957 1659744169.1565957 task:0 finished - https://google.com
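
In the run above, all three tasks are waiting on resp2 at the same time for a while (roughly from 1659744164 to 1659744166), so the second awaits overlap just like the first ones. As a rough sketch of how both responses could be handed back to the caller, the following uses a hypothetical fake_get in place of client.get and asyncio.gather in place of as_completed; the function names and the 0.05 second delays are assumptions made for illustration only.

```python
import asyncio
import time

async def fake_get(url, delay):
    # Hypothetical stand-in for client.get(url): waits, then returns a label.
    await asyncio.sleep(delay)
    return f"response from {url}"

async def fetch_twice(url):
    resp1 = await fake_get(url, 0.05)   # first await: task suspended, others run
    # ... other computation ...
    resp2 = await fake_get(url, 0.05)   # second await: suspended again
    return url, resp1, resp2            # the task is complete only at this point

async def fetch_all(urls):
    tasks = [asyncio.create_task(fetch_twice(url)) for url in urls]
    # gather waits for every task to finish and returns results in input order
    return await asyncio.gather(*tasks)

def main():
    urls = ["https://google.com", "https://yahoo.com", "https://youtube.com"]
    start = time.perf_counter()
    results = asyncio.run(fetch_all(urls))
    elapsed = time.perf_counter() - start
    for url, resp1, resp2 in results:
        print(url, resp1, resp2)
    # Six sequential waits would take about 0.3 s; overlapping tasks take far less.
    print(f"elapsed: {elapsed:.2f}s")

if __name__ == "__main__":
    main()
```

Returning a tuple from the coroutine is one simple way to expose both responses; as_completed would work the same way if results are wanted in completion order rather than input order.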
