Python Crawlers: Multi-Task Asynchronous Crawling with aiohttp


The Flask service to be crawled is shown below. Each route sleeps for 2 seconds to simulate a slow response:

from flask import Flask
import time

app = Flask(__name__)


@app.route('/bobo')
def index_bobo():
    time.sleep(2)
    return 'Hello bobo'

@app.route('/jay')
def index_jay():
    time.sleep(2)
    return 'Hello jay'

@app.route('/tom')
def index_tom():
    time.sleep(2)
    return 'Hello tom'

if __name__ == '__main__':
    app.run(threaded=True)
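
To see why the 2-second sleep matters, here is a minimal stdlib-only sketch of a blocking client that calls the routes one at a time. `slow_route` is a hypothetical stand-in for the Flask views above (it sleeps instead of making an HTTP request), and `DELAY` is an assumed, shortened delay so the sketch runs quickly. With the real 2-second routes, a sequential crawl of three URLs would take roughly 6 seconds.

```python
import time

DELAY = 0.2  # assumption: shortened stand-in for the 2-second sleep in each route


def slow_route(name):
    """Hypothetical stand-in for one Flask view: block, then return its body."""
    time.sleep(DELAY)
    return f'Hello {name}'


def crawl_sequentially(names):
    """Call each route one after another; return (bodies, elapsed seconds)."""
    start = time.perf_counter()
    bodies = [slow_route(name) for name in names]
    return bodies, time.perf_counter() - start


if __name__ == '__main__':
    bodies, elapsed = crawl_sequentially(['bobo', 'jay', 'tom'])
    print(bodies)
    print(f'sequential: {elapsed:.2f}s')
```

Because each call blocks until it returns, the elapsed time grows with the number of routes. This is the baseline that a multi-task asynchronous crawler aims to beat.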

With the Flask service running, the multi-task crawler code is as follows:

# Setup: pip install aiohttp
# This example uses the module's ClientSession
import asyncio
import time
import aiohttp

start = time.time()
urls = [
    'http://127.0.0.1:5000/bobo','http://127.0.0.1:5000/jay','http://127.0.0.1:5000/tom'
]

async def get_page(url):
    async with aiohttp.ClientSession() as session:
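        # get() and post() accept the same kinds of arguments:
        # headers, params/data, proxy='http://ip:port'
        async with session.get(url) as response:
            # text() returns the response body as a string
            # read() returns the response body as bytes
            # json() returns the parsed JSON object
            # Note: these are coroutines, so they must be awaited to get the data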
            page_text = await response.text()
            print(page_text)

async def main():
    # Schedule one coroutine per URL and wait until all of them finish
    await asyncio.gather(*(get_page(url) for url in urls))

# asyncio.run() creates the event loop, runs main(), and closes the loop
asyncio.run(main())

end = time.time()

print('Total time:', end - start)
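
The timing effect of the crawler above can be reproduced without a server or aiohttp. The following is a minimal sketch, not the article's code: `fake_get_page` is a hypothetical stand-in for `get_page` that awaits `asyncio.sleep` instead of an HTTP request, and `DELAY` is an assumed, shortened delay. Since the three coroutines wait at the same time, the total should be close to one delay, not three.

```python
import asyncio
import time

DELAY = 0.2  # assumption: shortened stand-in for each route's 2-second sleep


async def fake_get_page(name):
    """Hypothetical stand-in for get_page(): wait without blocking the loop."""
    await asyncio.sleep(DELAY)
    return f'Hello {name}'


async def crawl_concurrently(names):
    """Run all fetches at once; return (bodies, elapsed seconds)."""
    start = time.perf_counter()
    # gather() returns results in the same order as its arguments
    bodies = await asyncio.gather(*(fake_get_page(n) for n in names))
    return bodies, time.perf_counter() - start


if __name__ == '__main__':
    bodies, elapsed = asyncio.run(crawl_concurrently(['bobo', 'jay', 'tom']))
    print(bodies)
    print(f'concurrent: {elapsed:.2f}s')
```

Against the real Flask service the same reasoning suggests a total of about 2 seconds for the three URLs, provided the server handles requests in parallel (hence `threaded=True`).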

Article link: https://freeymw.com/article/12944.html
