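17.4. concurrent.futures - Launching parallel tasks

New in version 3.2.

Source code: Lib/concurrent/futures/thread.py and Lib/concurrent/futures/process.py

The concurrent.futures module provides a high-level interface for asynchronously executing callables.

The asynchronous execution can be performed with threads, using ThreadPoolExecutor, or with separate processes, using ProcessPoolExecutor. Both implement the same interface, which is defined by the abstract Executor class.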

17.4.1. Executor Objects

class concurrent.futures.Executor

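An abstract class that provides methods to execute calls asynchronously. It should not be used directly, but through its concrete subclasses.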

submit(fn, *args, **kwargs)

Schedules the callable, fn, to be executed as fn(*args, **kwargs) and returns a Future object representing the execution of the callable.

from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(pow, 323, 1235)
    print(future.result())

map(func, *iterables, timeout=None, chunksize=1)

Equivalent to map(func, *iterables) except func is executed asynchronously and several calls to func may be made concurrently.

The returned iterator raises a concurrent.futures.TimeoutError if __next__() is called and the result isn’t available after timeout seconds from the original call to Executor.map(). timeout can be an int or a float. If timeout is not specified or None, there is no limit to the wait time.

If a call raises an exception, then that exception will be raised when its value is retrieved from the iterator.

When using ProcessPoolExecutor, this method chops iterables into a number of chunks which it submits to the pool as separate tasks. The (approximate) size of these chunks can be specified by setting chunksize to a positive integer. For very long iterables, using a large value for chunksize can significantly improve performance compared to the default size of 1. With ThreadPoolExecutor, chunksize has no effect.

Changed in version 3.5: Added the chunksize argument.
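As a rough illustration of the ordering and timeout behaviour described above, the following sketch runs map() on a thread pool. The worker functions square and slow_identity are hypothetical names introduced only for this example, and the sleep and timeout durations are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError
import time

def square(x):
    # Hypothetical worker function used only for illustration.
    return x * x

def slow_identity(x):
    # Hypothetical worker that takes longer than the timeout used below.
    time.sleep(0.5)
    return x

with ThreadPoolExecutor(max_workers=4) as executor:
    # Results are yielded in input order, even if calls finish out of order.
    squares = list(executor.map(square, range(5)))

    # The timeout is measured from the call to map(), not per item.
    timed_out = False
    try:
        list(executor.map(slow_identity, [1, 2], timeout=0.05))
    except TimeoutError:
        timed_out = True

print(squares)
print(timed_out)
```

The exception surfaces only when the iterator is consumed, which is why the call is wrapped in list() inside the try block.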

shutdown(wait=True)

Signal the executor that it should free any resources that it is using when the currently pending futures are done executing. Calls to Executor.submit() and Executor.map() made after shutdown will raise RuntimeError.

If wait is True then this method will not return until all the pending futures are done executing and the resources associated with the executor have been freed. If wait is False then this method will return immediately and the resources associated with the executor will be freed when all pending futures are done executing. Regardless of the value of wait, the entire Python program will not exit until all pending futures are done executing.

You can avoid having to call this method explicitly if you use the with statement, which will shut down the Executor (waiting as if Executor.shutdown() were called with wait set to True):

import shutil
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=4) as e:
    e.submit(shutil.copy, 'src1.txt', 'dest1.txt')
    e.submit(shutil.copy, 'src2.txt', 'dest2.txt')
    e.submit(shutil.copy, 'src3.txt', 'dest3.txt')
    e.submit(shutil.copy, 'src4.txt', 'dest4.txt')
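For comparison, here is a minimal sketch of calling shutdown() explicitly with wait=False. It assumes a hypothetical task named slow_double and is meant only to show that already-submitted work still finishes while new submissions are rejected.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def slow_double(x):
    # Hypothetical task that takes a moment to finish.
    time.sleep(0.1)
    return 2 * x

executor = ThreadPoolExecutor(max_workers=1)
future = executor.submit(slow_double, 21)

# Returns immediately; the pending future still runs to completion.
executor.shutdown(wait=False)

rejected = False
try:
    executor.submit(slow_double, 1)
except RuntimeError:
    # Submitting after shutdown is an error.
    rejected = True

value = future.result()  # blocks until the already-submitted call finishes
print(rejected, value)
```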

17.4.2. ThreadPoolExecutor

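ThreadPoolExecutor is an Executor subclass that uses a pool of threads to execute calls asynchronously.

Deadlocks can occur when the callable associated with a Future waits on the result of another Future. For example: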

import time
from concurrent.futures import ThreadPoolExecutor

def wait_on_b():
    time.sleep(5)
    print(b.result())  # b will never complete because it is waiting on a.
    return 5

def wait_on_a():
    time.sleep(5)
    print(a.result())  # a will never complete because it is waiting on b.
    return 6


executor = ThreadPoolExecutor(max_workers=2)
a = executor.submit(wait_on_b)
b = executor.submit(wait_on_a)

And:

def wait_on_future():
    f = executor.submit(pow, 5, 2)
    # This will never complete because there is only one worker thread and
    # it is executing this function.
    print(f.result())

executor = ThreadPoolExecutor(max_workers=1)
executor.submit(wait_on_future)

class concurrent.futures.ThreadPoolExecutor(max_workers=None)

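An Executor subclass that uses a pool of at most max_workers threads to execute calls asynchronously.

Changed in version 3.5: If max_workers is None or not given, it will default to the number of processors on the machine, multiplied by 5, assuming that ThreadPoolExecutor is often used to overlap I/O instead of CPU work and that the number of workers should be higher than the number of workers for ProcessPoolExecutor.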

17.4.2.1. ThreadPoolExecutor Example

import concurrent.futures
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))

17.4.3. ProcessPoolExecutor

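The ProcessPoolExecutor class is an Executor subclass that uses a pool of processes to execute calls asynchronously. ProcessPoolExecutor uses the multiprocessing module, which allows it to side-step the Global Interpreter Lock but also means that only picklable objects can be executed and returned.

The __main__ module must be importable by worker subprocesses. This means that ProcessPoolExecutor will not work in the interactive interpreter.

Calling Executor or Future methods from a callable submitted to a ProcessPoolExecutor will result in deadlock.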

class concurrent.futures.ProcessPoolExecutor(max_workers=None)

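An Executor subclass that executes calls asynchronously using a pool of at most max_workers processes. If max_workers is None or not given, it will default to the number of processors on the machine. If max_workers is lower than or equal to 0, then a ValueError will be raised.

Changed in version 3.3: When one of the worker processes terminates abruptly, a BrokenProcessPool error is now raised. Previously, behaviour was undefined but operations on the executor or its futures would often freeze or deadlock.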

17.4.3.1. ProcessPoolExecutor Example

import concurrent.futures
import math

PRIMES = [
    112272535095293,
    112582705942171,
    112272535095293,
    115280095190773,
    115797848077099,
    1099726899285419]

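def is_prime(n):
    # Handle the small cases that the odd-divisor loop below does not cover.
    if n < 2:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False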

    sqrt_n = int(math.floor(math.sqrt(n)))
    for i in range(3, sqrt_n + 1, 2):
        if n % i == 0:
            return False
    return True

def main():
    with concurrent.futures.ProcessPoolExecutor() as executor:
        for number, prime in zip(PRIMES, executor.map(is_prime, PRIMES)):
            print('%d is prime: %s' % (number, prime))

if __name__ == '__main__':
    main()

17.4.4. Future Objects

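The Future class encapsulates the asynchronous execution of a callable. Future instances are created by Executor.submit().

class concurrent.futures.Future

Encapsulates the asynchronous execution of a callable. Future instances are created by Executor.submit() and should not be created directly except for testing.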

cancel()

Attempt to cancel the call. If the call is currently being executed and cannot be cancelled then the method will return False, otherwise the call will be cancelled and the method will return True.

cancelled()

Return True if the call was successfully cancelled.

running()

Return True if the call is currently being executed and cannot be cancelled.

done()

Return True if the call was successfully cancelled or finished running.
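The state queries above can be sketched with a single-worker pool, where a second call is guaranteed to stay queued behind the first. The task block_until_released and the threading.Event used to hold the worker are assumptions made for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

release = threading.Event()

def block_until_released():
    # Hypothetical task that occupies the single worker until told to stop.
    release.wait()
    return 'finished'

with ThreadPoolExecutor(max_workers=1) as executor:
    first = executor.submit(block_until_released)   # handled by the only worker
    second = executor.submit(block_until_released)  # stays queued behind it

    # A call that has not started yet can still be cancelled.
    second_cancelled = second.cancel()

    release.set()
    first_result = first.result()

print(second_cancelled, second.cancelled(), second.done())
print(first_result, first.cancelled(), first.done(), first.running())
```

Note that a cancelled future counts as done, so done() is True for both futures at the end.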

result(timeout=None)

Return the value returned by the call. If the call hasn’t yet completed then this method will wait up to timeout seconds. If the call hasn’t completed in timeout seconds, then a concurrent.futures.TimeoutError will be raised. timeout can be an int or float. If timeout is not specified or None, there is no limit to the wait time.

If the future is cancelled before completing then CancelledError will be raised.

If the call raised, this method will raise the same exception.

exception(timeout=None)

Return the exception raised by the call. If the call hasn’t yet completed then this method will wait up to timeout seconds. If the call hasn’t completed in timeout seconds, then a concurrent.futures.TimeoutError will be raised. timeout can be an int or float. If timeout is not specified or None, there is no limit to the wait time.

If the future is cancelled before completing then CancelledError will be raised.

If the call completed without raising, None is returned.
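The difference between result() and exception() may be easier to see side by side. This is a small sketch using a hypothetical divide task, one call that succeeds and one that raises.

```python
from concurrent.futures import ThreadPoolExecutor

def divide(a, b):
    # Hypothetical task; raises ZeroDivisionError when b is 0.
    return a / b

with ThreadPoolExecutor(max_workers=2) as executor:
    ok = executor.submit(divide, 6, 3)
    bad = executor.submit(divide, 1, 0)

    ok_value = ok.result()
    ok_error = ok.exception()      # None: the call did not raise

    bad_error = bad.exception()    # the exception object is returned, not raised
    try:
        bad.result()               # raises the exception from the call
    except ZeroDivisionError as exc:
        reraised = exc

print(ok_value, ok_error)
print(type(bad_error).__name__)
```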

add_done_callback(fn)

Attaches the callable fn to the future. fn will be called, with the future as its only argument, when the future is cancelled or finishes running.

Added callables are called in the order that they were added and are always called in a thread belonging to the process that added them. If the callable raises an Exception subclass, it will be logged and ignored. If the callable raises a BaseException subclass, the behavior is undefined.

If the future has already completed or been cancelled, fn will be called immediately.
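A minimal sketch of add_done_callback(), assuming a hypothetical callback named record that collects results into a list. The second registration happens after the future has completed, so that callback is invoked immediately.

```python
from concurrent.futures import ThreadPoolExecutor

completed = []

def record(future):
    # Hypothetical callback: receives the finished future as its only argument.
    completed.append(future.result())

with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(pow, 2, 10)
    future.add_done_callback(record)

# The future is already done here, so this callback runs immediately.
future.add_done_callback(record)
print(completed)
```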

The following Future methods are meant for use in unit tests and Executor implementations.

set_running_or_notify_cancel()

This method should only be called by Executor implementations before executing the work associated with the Future and by unit tests.

If the method returns False then the Future was cancelled, i.e. Future.cancel() was called and returned True. Any threads waiting on the Future completing (i.e. through as_completed() or wait()) will be woken up.

If the method returns True then the Future was not cancelled and has been put in the running state, i.e. calls to Future.running() will return True.

This method can only be called once and cannot be called after Future.set_result() or Future.set_exception() have been called.

set_result(result)

Sets the result of the work associated with the Future to result.

This method should only be used by Executor implementations and unit tests.

set_exception(exception)

Sets the result of the work associated with the Future to the Exception exception.

This method should only be used by Executor implementations and unit tests.
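To show how these three methods fit together, the sketch below drives a Future by hand, roughly the way a very simple executor or a unit test might. The helper run_inline is hypothetical and runs the call synchronously; it is not how the bundled executors are implemented.

```python
from concurrent.futures import Future

def run_inline(fn, *args):
    # Hypothetical helper mimicking what an Executor does for one call.
    future = Future()
    if not future.set_running_or_notify_cancel():
        return future  # the future was cancelled before it started
    try:
        result = fn(*args)
    except Exception as exc:
        future.set_exception(exc)
    else:
        future.set_result(result)
    return future

good = run_inline(int, '42')
bad = run_inline(int, 'not a number')
print(good.result(), type(bad.exception()).__name__)

# A future cancelled while pending reports False and must not be run.
cancelled = Future()
cancelled.cancel()
started = cancelled.set_running_or_notify_cancel()
print(started, cancelled.cancelled())
```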

17.4.5. Module Functions

concurrent.futures.wait(fs, timeout=None, return_when=ALL_COMPLETED)

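Wait for the Future instances (possibly created by different Executor instances) given by fs to complete. Returns a named 2-tuple of sets. The first set, named done, contains the futures that completed (finished or were cancelled) before the wait completed. The second set, named not_done, contains the uncompleted futures.

timeout can be used to control the maximum number of seconds to wait before returning. timeout can be an int or float. If timeout is not specified or None, there is no limit to the wait time.

return_when indicates when this function should return. It must be one of the following constants: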

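Constant         Description
FIRST_COMPLETED  The function will return when any future finishes or is cancelled.
FIRST_EXCEPTION  The function will return when any future finishes by raising an exception. If no future raises an exception then it is equivalent to ALL_COMPLETED.
ALL_COMPLETED    The function will return when all futures finish or are cancelled.

concurrent.futures.as_completed(fs, timeout=None)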

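Returns an iterator over the Future instances (possibly created by different Executor instances) given by fs that yields futures as they complete (finished or were cancelled). Any futures given by fs that are duplicated will be returned once. Any futures that completed before as_completed() is called will be yielded first. The returned iterator raises a concurrent.futures.TimeoutError if __next__() is called and the result isn’t available after timeout seconds from the original call to as_completed(). timeout can be an int or float. If timeout is not specified or None, there is no limit to the wait time.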
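The two module functions can be compared in one small sketch. It assumes two hypothetical tasks, quick and gated, where gated blocks on a threading.Event so that the order of completion is predictable.

```python
from concurrent.futures import (
    ThreadPoolExecutor, wait, as_completed, FIRST_COMPLETED)
import threading

gate = threading.Event()

def quick():
    # Hypothetical task that returns right away.
    return 'quick'

def gated():
    # Hypothetical task that blocks until the gate is opened.
    gate.wait()
    return 'gated'

with ThreadPoolExecutor(max_workers=2) as executor:
    fast = executor.submit(quick)
    slow = executor.submit(gated)

    # Returns as soon as one future is done; the other is still pending.
    done, not_done = wait([fast, slow], return_when=FIRST_COMPLETED)
    first_batch = (done == {fast}, not_done == {slow})

    completed_iter = as_completed([slow, fast])
    first = next(completed_iter)    # fast: it had already completed
    gate.set()
    second = next(completed_iter)   # slow: completes once the gate opens
    order = [first.result(), second.result()]

print(first_batch, order)
```

wait() hands back both sets at once, while as_completed() yields futures one at a time in completion order, regardless of the order in which they were passed in.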

See also

PEP 3148 - futures - execute computations asynchronously
The proposal which described this feature for inclusion in the Python standard library.

17.4.6. Exception classes

exception concurrent.futures.CancelledError

Raised when a future is cancelled.

exception concurrent.futures.TimeoutError

Raised when a future operation exceeds the given timeout.

exception concurrent.futures.process.BrokenProcessPool

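Derived from RuntimeError, this exception class is raised when one of the workers of a ProcessPoolExecutor has terminated in a non-clean fashion (for example, if it was killed from the outside).

New in version 3.3.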