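Why concurrency matters in Python:
In a single-threaded program, an I/O-bound task (network requests, disk reads and writes, and so on) spends most of its time waiting for I/O to complete. Those waits block the thread and make the program slow to respond. A single thread also cannot take advantage of a multi-core CPU for CPU-bound work.
Concurrency improves throughput. With threads, processes, or coroutines, a program can give up the CPU while one task waits on I/O and run other tasks in the meantime. For CPU-bound work, multiple processes can additionally spread the computation across cores. In standard CPython builds, threads and coroutines do not, because the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time.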
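To make the effect of overlapping I/O waits concrete, here is a minimal sketch that compares sequential and threaded execution. It assumes `time.sleep` is an acceptable stand-in for a blocking I/O call; `fake_io`, `run_sequential`, and `run_threaded` are hypothetical names introduced only for this illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(task_id, delay=0.2):
    """Hypothetical stand-in for a blocking I/O call: it only sleeps."""
    time.sleep(delay)
    return task_id


def run_sequential(n):
    """Run n fake I/O calls one after another; return (results, seconds)."""
    start = time.perf_counter()
    results = [fake_io(i) for i in range(n)]
    return results, time.perf_counter() - start


def run_threaded(n):
    """Run n fake I/O calls in n threads; return (results, seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n) as pool:
        # pool.map preserves input order in its results.
        results = list(pool.map(fake_io, range(n)))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    _, seq_time = run_sequential(5)
    _, thr_time = run_threaded(5)
    print(f"sequential: {seq_time:.2f}s, threaded: {thr_time:.2f}s")
```

The threaded run should take roughly one delay rather than five, because the waits overlap. A loop of pure computation would not be expected to show the same gain with threads.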
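A guide to using concurrency in Python:
1. Multithreading:
A thread is the smallest unit of execution that the operating system can schedule. In Python, multithreading is provided by the threading module, whose Thread class creates thread objects.
Example 1: Fetch several websites and save the results to one file.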
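import threading
import requests

# Serialize file writes so output from different threads does not interleave.
write_lock = threading.Lock()

def get_page(url):
    # The timeout keeps a stalled server from hanging the thread forever.
    res = requests.get(url, timeout=10)
    with write_lock:
        with open('result.txt', 'a', encoding='utf-8') as f:
            f.write(res.text)

if __name__ == '__main__':
    urls = ['http://www.python.org', 'http://www.baidu.com', 'http://www.cnblogs.com']
    # Create one thread per URL, start them all, then wait for all to finish.
    threads = [threading.Thread(target=get_page, args=(url,)) for url in urls]
    for t in threads:
        t.start()
    for t in threads:
        t.join()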
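The same fan-out pattern can also be sketched with the standard library's `concurrent.futures.ThreadPoolExecutor`, which returns each result to the caller so that only one thread needs to touch the output. This is one possible variant and not the only way to structure it. `fetch_all` and `fake_fetch` are hypothetical helpers; `fake_fetch` stands in for a real HTTP call so the sketch runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_all(urls, fetch, max_workers=4):
    """Run fetch(url) in worker threads; return {url: result or exception}."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        future_to_url = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(future_to_url):
            url = future_to_url[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                # Keep the failure instead of losing it inside the worker.
                results[url] = exc
    return results


def fake_fetch(url):
    """Hypothetical stand-in for requests.get(url).text (assumption)."""
    if "bad" in url:
        raise ValueError(f"cannot fetch {url}")
    return f"<html>{url}</html>"


if __name__ == "__main__":
    pages = fetch_all(["http://a.example", "http://bad.example"], fake_fetch)
    for url, page in pages.items():
        print(url, "->", page)
```

Because the results come back to the calling thread, they can be written to a file there without a lock.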
Example 2: Accumulate a count with multiple threads.
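import threading

TOTAL = 100000000   # total number of increments to perform
NUM_WORKERS = 10
count = 0           # shared counter, starts at zero
count_lock = threading.Lock()

def increment_count():
    global count
    # Count locally, then merge under the lock: an unguarded `count += 1`
    # on a shared global is a read-modify-write and is not thread-safe.
    local = 0
    for _ in range(TOTAL // NUM_WORKERS):
        local += 1
    with count_lock:
        count += local

if __name__ == "__main__":
    workers = []
    for _ in range(NUM_WORKERS):
        worker = threading.Thread(target=increment_count)
        workers.append(worker)
        worker.start()
    for worker in workers:
        worker.join()
    print("Final count:", count)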
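2. Multiprocessing:
A process is the basic unit by which the operating system allocates and manages resources. Python provides multiprocessing through the multiprocessing module, whose Process class creates process objects.
Example 1: Download files in parallel with multiple processes.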
import urllib.request
import multiprocessing

def download(url):
    # Use the last segment of the URL as the local file name.
    filename = url.split('/')[-1]
    urllib.request.urlretrieve(url, filename)
    print(filename, 'downloaded.')

if __name__ == "__main__":
    urls = ['http://www.bing.com', 'http://www.baidu.com', 'http://www.cnblogs.com']
    p_list = []
    for url in urls:
        # One process per URL; each download runs in its own interpreter.
        p = multiprocessing.Process(target=download, args=(url,))
        p_list.append(p)
        p.start()
    # Wait for every download to finish.
    for p in p_list:
        p.join()
Example 2: Accumulate a count with multiple processes.
from multiprocessing import Process, Lock, Value

COUNT = 100000000
NUM_WORKERS = 10

def increment_count(count, lock):
    # Count locally so the workers really run in parallel, then take the
    # lock once to add the partial result to the shared value.
    local = 0
    for _ in range(COUNT // NUM_WORKERS):
        local += 1
    with lock:
        count.value += local

if __name__ == "__main__":
    lock = Lock()
    count = Value('i', 0)  # shared C int, initial value 0
    workers = []
    for _ in range(NUM_WORKERS):
        worker = Process(target=increment_count, args=(count, lock))
        workers.append(worker)
        worker.start()
    for worker in workers:
        worker.join()
    print("Final count:", count.value)
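3. Coroutines:
A coroutine is a lightweight, user-space unit of execution that is scheduled by the program rather than by the operating system. Python coroutines grew out of generators and the yield statement; current code usually writes them with async def and await. A coroutine can suspend itself while it waits on I/O so that other coroutines can run.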
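To make the suspend-and-resume idea concrete, here is a minimal sketch that uses plain generators and a hand-written round-robin scheduler. It only illustrates the mechanism and is not how gevent or asyncio are implemented. `worker` and `round_robin` are hypothetical names.

```python
from collections import deque


def worker(name, steps):
    """A generator-based task: each yield hands control back to the scheduler."""
    for i in range(steps):
        yield f"{name}:{i}"


def round_robin(generators):
    """Resume each task in turn until all are exhausted; return the trace."""
    queue = deque(generators)
    trace = []
    while queue:
        gen = queue.popleft()
        try:
            trace.append(next(gen))  # run the task up to its next yield
        except StopIteration:
            continue                 # task finished, do not re-queue it
        queue.append(gen)            # task suspended, schedule it again
    return trace


if __name__ == "__main__":
    print(round_robin([worker("a", 2), worker("b", 3)]))
    # prints ['a:0', 'b:0', 'a:1', 'b:1', 'b:2']
```

The two tasks interleave on a single thread without any operating-system scheduling, which is why switching between coroutines is cheap.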
Example 1: Coroutine concurrency with the gevent module.
from gevent import monkey
monkey.patch_all()  # patch blocking stdlib calls before importing requests

import gevent
import requests

def get_page(url):
    res = requests.get(url)
    print(res.text)

if __name__ == '__main__':
    urls = ['http://www.python.org', 'http://www.baidu.com', 'http://www.cnblogs.com']
    # Spawn one greenlet per URL and wait for all of them to finish.
    gevent.joinall([gevent.spawn(get_page, url) for url in urls])
Example 2: Coroutine concurrency with the asyncio module.
import asyncio

async def acquire_resource():
    # Simulate a slow acquisition with a non-blocking sleep.
    await asyncio.sleep(2)
    return 1

async def use_resource(val):
    await asyncio.sleep(1)
    print(val)

async def main():
    # Acquire five resources concurrently; the waits overlap.
    results = await asyncio.gather(*(acquire_resource() for _ in range(5)))
    # Then use all of them concurrently.
    await asyncio.gather(*(use_resource(res) for res in results))

if __name__ == "__main__":
    # asyncio.run creates the event loop, runs main(), and closes the loop.
    asyncio.run(main())
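As a quick check that gathered coroutines overlap their waits instead of running back to back, the following sketch times a batch of simulated requests. It assumes `asyncio.sleep` is a fair stand-in for non-blocking I/O. `fake_request` and `main` are hypothetical names used only here.

```python
import asyncio
import time


async def fake_request(i, delay=0.2):
    """Hypothetical stand-in for an awaitable I/O call."""
    await asyncio.sleep(delay)
    return i * 2


async def main(n=5):
    """Run n fake requests concurrently; return (results, seconds)."""
    start = time.perf_counter()
    # gather returns results in the order the awaitables were passed in.
    results = await asyncio.gather(*(fake_request(i) for i in range(n)))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    print(results, f"{elapsed:.2f}s")
```

With five requests of 0.2 s each, the elapsed time should be close to 0.2 s and well under the 1 s that sequential execution would need.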
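All three approaches give you concurrency in Python, with different strengths. Threads suit I/O-bound tasks and work with existing blocking libraries. Processes suit CPU-bound tasks because each process can run on its own core. Coroutines also suit I/O-bound tasks and are the lighter choice when there are very many concurrent operations, provided the libraries involved are asynchronous. Choose based on the type of task and your actual situation.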