Synchronous submission vs. asynchronous submission


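Two ways to submit a task:
  Synchronous call: after submitting a task, the caller waits in place until the task has run to completion and returned its result, and only then executes the next line of code. As a result, the tasks run serially.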
from concurrent.futures import ProcessPoolExecutor
import time,random,os

def task(name,n):
    print('%s%s is running' %(name,os.getpid()))
    time.sleep(random.randint(1,3))
    return n**2

if __name__ == '__main__':
    # print(os.cpu_count())  # show the number of CPUs
    p=ProcessPoolExecutor(4)

    for i in range(10):
        # synchronous submission: .result() blocks until this task has finished
        res=p.submit(task,'process pid: ',i).result()
        print(res)
    print("")

Output:
process pid: 10720 is running
0
process pid: 10724 is running
1
process pid: 5948 is running
4
process pid: 2068 is running
9
process pid: 10720 is running
16
process pid: 10724 is running
25
process pid: 5948 is running
36
process pid: 2068 is running
49
process pid: 10720 is running
64
process pid: 10724 is running
81
Listing: synchronous submission
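  Asynchronous call: after submitting a task, the caller does not wait in place but goes straight on to the next line of code. As a result, the tasks run concurrently.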
from concurrent.futures import ProcessPoolExecutor
import time,random,os

def task(name,n):
    print('%s%s is running' %(name,os.getpid()))
    time.sleep(random.randint(1,3))
    return n**2

if __name__ == '__main__':
    p=ProcessPoolExecutor(4)
    l = []
    for i in range(10):
        # asynchronous submission: keep the future and move on without waiting
        future = p.submit(task, 'process pid: ', i)
        l.append(future)
    # shutdown closes the pool to new submissions (no more tasks can be added);
    # with wait=True it also blocks here until every task in the pool has finished
    p.shutdown(wait=True)

    for future in l:
        print(future.result())
    print("")

Output:
process pid: 10956 is running
process pid: 11040 is running
process pid: 10552 is running
process pid: 11332 is running
process pid: 10552 is running
process pid: 11040 is running
process pid: 10552 is running
process pid: 10956 is running
process pid: 11332 is running
process pid: 10956 is running
0
1
4
9
16
25
36
49
64
81
Listing: asynchronous submission
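
The difference between the two submission styles shows up directly in wall-clock time. The following is only a minimal sketch, not the code from the listings above: it assumes a fixed 0.1 s sleep instead of a random one, and it uses ThreadPoolExecutor so that it runs quickly on any platform (the Future API is the same for ProcessPoolExecutor). The names `slow_square`, `run_sync`, and `run_async` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
import time

DELAY = 0.1  # assumed fixed task duration (the listings above use a random sleep)

def slow_square(n):
    """Hypothetical stand-in for task() from the listings above."""
    time.sleep(DELAY)
    return n ** 2

def run_sync(count, workers=4):
    """Submit and wait immediately: the tasks run one after another."""
    start = time.perf_counter()
    with ThreadPoolExecutor(workers) as pool:
        results = [pool.submit(slow_square, i).result() for i in range(count)]
    return results, time.perf_counter() - start

def run_async(count, workers=4):
    """Submit everything first and collect afterwards: the tasks overlap."""
    start = time.perf_counter()
    with ThreadPoolExecutor(workers) as pool:
        futures = [pool.submit(slow_square, i) for i in range(count)]
    # leaving the with-block calls shutdown(wait=True), so every future is done here
    results = [f.result() for f in futures]
    return results, time.perf_counter() - start

if __name__ == '__main__':
    sync_results, sync_time = run_sync(8)
    async_results, async_time = run_async(8)
    print('sync :', sync_results, round(sync_time, 2))
    print('async:', async_results, round(async_time, 2))
```

With 8 tasks and 4 workers, the synchronous version should take roughly 8 × 0.1 s, while the asynchronous version should take roughly 2 × 0.1 s. Both return the results in submission order, because the futures are read back in the order they were stored.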

Case study:

# crawl websites concurrently with a process pool
from concurrent.futures import ProcessPoolExecutor
import time,os
import requests


def get(url):
    print('%s GET %s' %(os.getpid(),url))
    time.sleep(3)
    response=requests.get(url)
    if response.status_code == 200:
        res=response.text
    else:
        res='download failed'
    # parsing happens inside the worker process, right after the download
    parse(res)

def parse(res):
    time.sleep(1)
    print('%s parse result: %s' %(os.getpid(),len(res)))

if __name__ == '__main__':
    urls=[
        'https://www.baidu.com',
        'https://www.sina.com.cn',
        'https://www.tmall.com',
        'https://www.jd.com',
        'https://www.python.org',
        'https://www.openstack.org',
        'https://www.baidu.com',
        'https://www.baidu.com',
        'https://www.baidu.com',
    ]

    p=ProcessPoolExecutor(9)
    l=[]
    start=time.time()
    for url in urls:
        future=p.submit(get,url)
        l.append(future)
    p.shutdown(wait=True)

    print('main',time.time()-start)
Output:
11952 GET https://www.baidu.com
11992 GET https://www.sina.com.cn
7136 GET https://www.tmall.com
11984 GET https://www.jd.com
11948 GET https://www.python.org
5952 GET https://www.openstack.org
12056 GET https://www.baidu.com
11128 GET https://www.baidu.com
11728 GET https://www.baidu.com
11952 parse result: 2443
11992 parse result: 578360
7136 parse result: 217570
11984 parse result: 90905
12056 parse result: 2443
11128 parse result: 2443
11728 parse result: 2443
11948 parse result: 48413
5952 parse result: 66284
main 10.874621868133545
Listing: crawling websites concurrently with a process pool
from concurrent.futures import ProcessPoolExecutor
import time,os
import requests

# concurrent download, asynchronous submission
def get(url):
    print('%s GET %s' %(os.getpid(),url))
    time.sleep(3)
    response=requests.get(url)
    if response.status_code == 200:
        res=response.text
    else:
        res='download failed'
    return res

def parse(future):
    time.sleep(1)
    res=future.result()
    print('%s parse result: %s' %(os.getpid(),len(res)))

if __name__ == '__main__':
    urls=[
        'https://www.baidu.com',
        'https://www.sina.com.cn',
        'https://www.tmall.com',
        'https://www.jd.com',
        'https://www.python.org',
        'https://www.openstack.org',
        'https://www.baidu.com',
        'https://www.baidu.com',
        'https://www.baidu.com',
    ]

    p=ProcessPoolExecutor(9)

    start=time.time()
    for url in urls:
        future=p.submit(get,url)
        # Asynchronous call: after submitting a task, do not wait in place but go
        # straight on to the next line, so the tasks run concurrently. Once a task
        # has finished, its future object is passed to the callback automatically.
        # add_done_callback registers a callback function: parse is triggered when
        # the task finishes and receives the future object as its only argument.
        # With a process pool the callback runs in the main process, which is why
        # every parse line in the output below shows the same pid.
        future.add_done_callback(parse)
    p.shutdown(wait=True)

    print('main',time.time()-start)
    print('main',os.getpid())

Output:
10056 GET https://www.baidu.com
11584 GET https://www.sina.com.cn
3624 GET https://www.tmall.com
12116 GET https://www.jd.com
10072 GET https://www.python.org
9784 GET https://www.openstack.org
11924 GET https://www.baidu.com
10272 GET https://www.baidu.com
3084 GET https://www.baidu.com
11744 parse result: 2443
11744 parse result: 217570
11744 parse result: 90981
11744 parse result: 2443
11744 parse result: 2443
11744 parse result: 2443
11744 parse result: 66304
11744 parse result: 578616
11744 parse result: 48181
main 24.2570583820343
main 11744
Listing: crawling websites asynchronously with a process pool, version 2
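
One case the callback version above does not handle is a download that raises an exception (for example a network error inside requests.get). In that case future.result() re-raises the exception inside the callback. The following is a minimal sketch of a more defensive callback, under stated assumptions: `fake_get` is a hypothetical downloader that does no network I/O and fails for one URL, `make_parse` and `crawl` are hypothetical helper names, and a thread pool is used only to keep the example light.

```python
from concurrent.futures import ThreadPoolExecutor

def fake_get(url):
    """Hypothetical downloader: no network access, fails for one URL."""
    if 'bad' in url:
        raise ConnectionError('cannot reach %s' % url)
    return '<html>%s</html>' % url

parsed = {}   # url -> page length
failed = {}   # url -> error message

def make_parse(url):
    """Build a done-callback that remembers which URL it belongs to."""
    def parse(future):
        error = future.exception()   # None if the task succeeded
        if error is not None:
            failed[url] = str(error)
        else:
            parsed[url] = len(future.result())
    return parse

def crawl(urls, workers=4):
    with ThreadPoolExecutor(workers) as pool:
        for url in urls:
            future = pool.submit(fake_get, url)
            future.add_done_callback(make_parse(url))
    # leaving the with-block waits for all tasks (and their callbacks) to finish

if __name__ == '__main__':
    crawl(['https://example.com/a', 'https://example.com/bad'])
    print('parsed:', parsed)
    print('failed:', failed)
```

Checking future.exception() first keeps one failed download from hiding the others: an exception that escapes a done-callback is only logged by the executor and otherwise ignored.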
from concurrent.futures import ThreadPoolExecutor
from threading import current_thread
import time,requests


def get(url):
    print('%s GET %s' %(current_thread().name,url))
    time.sleep(3)
    response=requests.get(url)
    if response.status_code == 200:
        res=response.text
    else:
        res='download failed'
    return res

def parse(future):
    time.sleep(1)
    res=future.result()
    print('%s parse result: %s' %(current_thread().name,len(res)))

if __name__ == '__main__':
    urls=[
        'https://www.baidu.com',
        'https://www.sina.com.cn',
        'https://www.tmall.com',
        'https://www.jd.com',
        'https://www.python.org',
        'https://www.openstack.org',
        'https://www.baidu.com',
        'https://www.baidu.com',
        'https://www.baidu.com',
    ]

    p=ThreadPoolExecutor(4)

    for url in urls:
        future=p.submit(get,url)
        future.add_done_callback(parse)

    p.shutdown(wait=True)

    print('main',current_thread().name)

# Blocking: when a task hits an I/O operation, it enters the blocked state and waits for a while.

Output:
ThreadPoolExecutor-0_0 GET https://www.baidu.com
ThreadPoolExecutor-0_1 GET https://www.sina.com.cn
ThreadPoolExecutor-0_2 GET https://www.tmall.com
ThreadPoolExecutor-0_3 GET https://www.jd.com

ThreadPoolExecutor-0_3 parse result: 90936
ThreadPoolExecutor-0_3 GET https://www.python.org
ThreadPoolExecutor-0_0 parse result: 2443
ThreadPoolExecutor-0_0 GET https://www.openstack.org
ThreadPoolExecutor-0_2 parse result: 217570
ThreadPoolExecutor-0_2 GET https://www.baidu.com
ThreadPoolExecutor-0_1 parse result: 578616
ThreadPoolExecutor-0_1 GET https://www.baidu.com

ThreadPoolExecutor-0_2 parse result: 2443
ThreadPoolExecutor-0_2 GET https://www.baidu.com
ThreadPoolExecutor-0_1 parse result: 2443
ThreadPoolExecutor-0_0 parse result: 66304
ThreadPoolExecutor-0_3 parse result: 48181

ThreadPoolExecutor-0_2 parse result: 2443
main MainThread
Listing: crawling websites with a thread pool
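
In the thread-pool output, each parse line is printed under the name of a worker thread rather than MainThread. The sketch below illustrates why, as far as CPython's current implementation goes: when the callback is registered before the task finishes, the pool invokes it in the worker thread that completed the task. Everything named here is an assumption made for illustration: the `work`, `record`, and `run` functions, the `demo` thread-name prefix, and the 0.2 s sleep that ensures the callback is registered first.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import current_thread
import time

seen = []  # (thread that ran the task, thread that ran the callback)

def work(n):
    time.sleep(0.2)  # long enough that the callback is registered before completion
    return current_thread().name

def record(future):
    seen.append((future.result(), current_thread().name))

def run(count=3):
    seen.clear()
    with ThreadPoolExecutor(2, thread_name_prefix='demo') as pool:
        for i in range(count):
            pool.submit(work, i).add_done_callback(record)
    return list(seen)

if __name__ == '__main__':
    for task_thread, callback_thread in run():
        print(task_thread, '->', callback_thread)
```

If a future has already finished when add_done_callback is called, the callback instead runs immediately in the calling thread, so callback code should not rely on a particular thread.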
