Python3 - Multiprocessing

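Multiprocessing is a system's ability to run more than one process in parallel. Put simply, it uses two or more CPUs (or cores) within a single computer system, so that work can be divided among several processes.

The processing units share main memory and peripherals in order to run programs simultaneously. A multiprocessing application is split into smaller parts that run independently, and the operating system assigns each process to a processor.

Python ships with a built-in package called multiprocessing that supports spawning processes. Before working with multiprocessing, we first need to understand the Process object.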

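Why multiprocessing?

Multiprocessing is essential for carrying out several tasks at once within a computer system. Imagine a computer with a single processor and no multiprocessing support that is handed several processes at the same time.

It has to interrupt the current task and switch to another one to keep all the processes moving. It is like a chef working alone in a kitchen: he has to do every job needed to cook a meal himself, such as chopping, cleaning, cooking, kneading dough, and baking.

Multiprocessing makes it possible to run several tasks at the same time without one interrupting another. It also makes it easier to keep track of all the tasks. This is why the concept of multiprocessing arose.

A multiprocessing system can be described as a computer with more than one central processing unit.
A multi-core processor is a single computing component with two or more independent processing units (cores).

In multiprocessing, several tasks can be assigned at once, each running on its own processor.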

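Python multiprocessing

Python provides the multiprocessing module for running multiple tasks on a single system. It offers a user-friendly, intuitive API for working with processes.

Let's look at a simple multiprocessing example.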

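from multiprocessing import Process

def disp():
    print('Hello !! Welcome to Python Tutorial')

if __name__ == '__main__':
    p = Process(target=disp)   # create a process that will run disp()
    p.start()                  # start the child process
    p.join()                   # wait for the child process to finish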

Output:

Hello !! Welcome to Python Tutorial

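In the code above, the Process class is imported and a Process object is created with disp() as its target function. The start() method starts the process, and the join() method waits for it to finish. Arguments can also be passed to the target function through the args keyword.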

# Python multiprocessing example
# importing the multiprocessing module

import multiprocessing

def cube(n):
    # Print the cube of the given number
    print("The Cube is: {}".format(n * n * n))

def square(n):
    # Print the square of the given number
    print("The Square is: {}".format(n * n))

if __name__ == "__main__":
    # Create two processes
    process1 = multiprocessing.Process(target= square, args=(5, ))
    process2 = multiprocessing.Process(target= cube, args=(5, ))

    # Start process 1
    process1.start()
    # Start process 2
    process2.start()

    # Wait for process 1 to finish
    process1.join()
    # Wait for process 2 to finish
    process2.join()

    # Both processes have finished
    print("Both processes are finished")

Output (the first two lines may appear in either order, because the two processes run concurrently):

The Cube is: 125
The Square is: 25
Both processes are finished

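The example above defines two functions: cube() computes the cube of the given number, and square() computes its square.

Next, two Process objects are created, each with two arguments. The first argument, target, is the function to run; the second, args, is a tuple of the arguments passed to that function.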

process1 = multiprocessing.Process(target= square, args=(5, ))
process2 = multiprocessing.Process(target= cube, args=(5, ))

The start() method is then used to start each process.

process1.start()
process2.start()

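The two join() calls make the main program wait for process1 and then for process2 to finish. The final statement runs only after both processes have completed.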
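
The effect of start() and join() on a process can be observed through is_alive() and exitcode. The following is a minimal sketch rather than part of the original example; greet() and describe() are hypothetical names chosen for illustration.

```python
from multiprocessing import Process


def greet(name):
    # Runs in the child process.
    print("Hello, {}!".format(name))


def describe(proc):
    # Summarize the state of a Process object (hypothetical helper).
    return {"alive": proc.is_alive(), "exitcode": proc.exitcode}


if __name__ == "__main__":
    p = Process(target=greet, args=("reader",))
    print(describe(p))   # not started yet: not alive, exitcode is None
    p.start()
    p.join()             # wait for the child process to finish
    print(describe(p))   # finished: not alive, exitcode is 0 on success
```

An exitcode of None means the process has not terminated yet, which is why it is only meaningful to read after join() returns.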

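Multiprocessing classes

Python's multiprocessing module provides many classes for building parallel programs. We will discuss its main classes: Process, Queue, and Lock. The Process class was covered in the previous example; next come the Queue and Lock classes.

First, a simple example that gets the number of CPUs in the current system.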

import multiprocessing
print("The number of CPU currently working in system : ", multiprocessing.cpu_count())

Output:

The number of CPU currently working in system :  32

The number of CPUs reported will vary from one machine to another.
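
One common use of cpu_count() is to decide how many worker processes to start. The sketch below is only one possible policy; pick_worker_count() is a hypothetical helper, not part of the multiprocessing module.

```python
import multiprocessing


def pick_worker_count(n_tasks):
    # Never start more workers than there are tasks or CPUs,
    # and always start at least one.
    return max(1, min(n_tasks, multiprocessing.cpu_count()))


if __name__ == "__main__":
    print(pick_worker_count(3))      # at most 3
    print(pick_worker_count(1000))   # capped at the number of CPUs
```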

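Multiprocessing queues

Queues are an important data structure. The multiprocessing Queue behaves like the familiar queue data structure based on the first-in, first-out (FIFO) principle. A queue usually stores Python objects and plays an essential role in sharing data between processes.

A queue is passed as an argument to a process's target function so that the process can use the data. The queue provides put() to insert data and get() to retrieve it.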

# Importing Queue Class

from multiprocessing import Queue

fruits = ['Apple', 'Orange', 'Guava', 'Papaya', 'Banana']
count = 1
# creating a queue object
queue = Queue()
print('pushing items to the queue:')
for fr in fruits:
    print('item no: ', count, ' ', fr)
    queue.put(fr)
    count += 1

print('\npopping items from the queue:')
count = 0
while not queue.empty():
    print('item no: ', count, ' ', queue.get())
    count += 1

Output:

pushing items to the queue:
item no:  1   Apple
item no:  2   Orange
item no:  3   Guava
item no:  4   Papaya
item no:  5   Banana

popping items from the queue:
item no:  0   Apple
item no:  1   Orange
item no:  2   Guava
item no:  3   Papaya
item no:  4   Banana

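In the code above, the Queue class is imported and a list named fruits is initialized. Next, count is set to 1; this variable numbers the elements as they are processed. A queue object is then created by calling Queue(), and it is used for all queue operations. Inside the for loop, put() inserts the elements into the queue one by one, and count is increased by 1 on each iteration. Finally, count is reset to 0 and a while loop calls get() to remove the elements in the same order until the queue is empty.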
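
The example above uses the queue inside a single process. To illustrate sharing data between processes, the following sketch passes a queue to a child process, which puts items on it while the parent reads them. The functions produce() and collect() are hypothetical names used only for this illustration.

```python
from multiprocessing import Process, Queue


def produce(q, items):
    # Runs in the child: put an upper-cased copy of each item on the queue.
    for item in items:
        q.put(item.upper())


def collect(q, n):
    # Block until exactly n items have been read, in FIFO order.
    return [q.get() for _ in range(n)]


if __name__ == "__main__":
    fruits = ["Apple", "Orange", "Guava"]
    q = Queue()
    child = Process(target=produce, args=(q, fruits))
    child.start()
    print(collect(q, len(fruits)))  # ['APPLE', 'ORANGE', 'GUAVA']
    child.join()
```

Reading a known number of items with a blocking get() avoids relying on empty(), which may briefly report an empty queue while another process is still writing.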

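Multiprocessing locks

The multiprocessing Lock class is used to acquire a lock so that other processes cannot execute the same section of code until the lock is released. A Lock mainly supports two operations: acquire() obtains the lock, and release() releases it.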
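
A sketch of how acquire() and release() protect a shared value. It assumes a shared integer created with multiprocessing.Value, which this tutorial does not otherwise cover, and a hypothetical worker function add_many(). The with statement calls acquire() on entry and release() on exit.

```python
from multiprocessing import Lock, Process, Value


def add_many(counter, lock, times):
    # Increment the shared counter; the lock makes each
    # read-modify-write step exclusive to one process at a time.
    for _ in range(times):
        with lock:                 # lock.acquire() ... lock.release()
            counter.value += 1


if __name__ == "__main__":
    lock = Lock()
    # lock=False gives a raw shared int, so our own Lock is the only guard.
    counter = Value("i", 0, lock=False)
    workers = [Process(target=add_many, args=(counter, lock, 1000))
               for _ in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(counter.value)  # 4000
```

Without the lock, increments from different processes could overwrite each other and the final count could be lower than 4000.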

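Suppose there are several tasks to run. Two queues are created: the first holds the tasks to perform, and the second stores a log of the completed tasks. The next step is to instantiate the processes that carry out the tasks. Because the Queue class is already synchronized, there is no need to use a Lock to protect it.

The following example brings the multiprocessing classes together.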

from multiprocessing import Lock, Process, Queue, current_process
import time
import queue 


def jobTodo(tasks_to_perform, complete_tasks):
    while True:
        try:

            # The try block to catch task from the queue.
            # The get_nowait() function is used to
            # raise queue.Empty exception if the queue is empty.

            task = tasks_to_perform.get_nowait()

        except queue.Empty:

            break
        else:
            # If no exception was raised, the else block runs:
            # print the task and record its completion.
            print(task)
            complete_tasks.put(task + ' is done by ' + current_process().name)
            time.sleep(.5)
    return True


def main():
    total_task = 8
    total_number_of_processes = 3
    tasks_to_perform = Queue()
    complete_tasks = Queue()
    number_of_processes = []

    for i in range(total_task):
        tasks_to_perform.put("Task no " + str(i))

    # defining number of processes
    for w in range(total_number_of_processes):
        p = Process(target=jobTodo, args=(tasks_to_perform, complete_tasks))
        number_of_processes.append(p)
        p.start()

    # completing process
    for p in number_of_processes:
        p.join()

    # print the output
    while not complete_tasks.empty():
        print(complete_tasks.get())

    return True


if __name__ == '__main__':
    main()

Output (the order of the lines and the process names may vary between runs):

Task no 2
Task no 5
Task no 0
Task no 3
Task no 6
Task no 1
Task no 4
Task no 7
Task no 0 is done by Process-1
Task no 1 is done by Process-3
Task no 2 is done by Process-2
Task no 3 is done by Process-1
Task no 4 is done by Process-3
Task no 5 is done by Process-2
Task no 6 is done by Process-1
Task no 7 is done by Process-3
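
The example above stops each worker when get_nowait() reports an empty queue. A common alternative, sketched here under the assumption that a reserved value can mark the end of the work, is to put one sentinel per worker on the task queue. STOP and worker() are hypothetical names for this illustration.

```python
from multiprocessing import Process, Queue, current_process

STOP = "STOP"  # hypothetical sentinel marking the end of the tasks


def worker(tasks, done):
    # iter(callable, sentinel) keeps calling tasks.get() until it returns STOP.
    for task in iter(tasks.get, STOP):
        done.put(task + " is done by " + current_process().name)


if __name__ == "__main__":
    n_tasks, n_workers = 8, 3
    tasks, done = Queue(), Queue()
    for i in range(n_tasks):
        tasks.put("Task no " + str(i))
    for _ in range(n_workers):
        tasks.put(STOP)            # one sentinel per worker

    procs = [Process(target=worker, args=(tasks, done))
             for _ in range(n_workers)]
    for p in procs:
        p.start()
    # Read all results before join() so no worker is left blocked on put().
    results = [done.get() for _ in range(n_tasks)]
    for p in procs:
        p.join()
    print(len(results))  # 8
```

Each worker blocks on get() until a task or a sentinel arrives, so it cannot exit early just because the queue looked empty for a moment.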

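Multiprocessing pool

A multiprocessing Pool is useful for executing a function in parallel across multiple input values. It distributes the input data across processes (data parallelism). Consider the following Pool example.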

from multiprocessing import Pool
import time

w = (["V", 5], ["X", 2], ["Y", 1], ["Z", 3])


def work_log(data_for_work):
    print(" Process name is %s waiting time is %s seconds" % (data_for_work[0], data_for_work[1]))
    time.sleep(int(data_for_work[1]))
    print(" Process %s Executed." % data_for_work[0])


def handler():
    p = Pool(2)
    p.map(work_log, w)

if __name__ == '__main__':
    handler()

Output (with two workers, V and X start together, so the first two lines may appear in either order):

 Process name is V waiting time is 5 seconds
 Process name is X waiting time is 2 seconds
 Process X Executed.
 Process name is Y waiting time is 1 seconds
 Process Y Executed.
 Process name is Z waiting time is 3 seconds
 Process V Executed.
 Process Z Executed.
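
The data parallelism mentioned above can be sketched as follows: the input is split into chunks, each chunk is handled by a pool worker, and the partial results are combined in the parent. The helpers chunked() and sum_squares() are hypothetical names for this illustration.

```python
from multiprocessing import Pool


def chunked(seq, size):
    # Split seq into consecutive lists of at most `size` items.
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def sum_squares(chunk):
    # Work done by one pool worker on one chunk.
    return sum(x * x for x in chunk)


if __name__ == "__main__":
    data = list(range(10))
    with Pool(2) as pool:
        partials = pool.map(sum_squares, chunked(data, 5))
    print(sum(partials))  # 285
```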

Here is another Pool example.

from multiprocessing import Pool
def fun(x):
    return x*x

if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(fun, [1, 2, 3]))

Output:

[1, 4, 9]
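
map() passes a single argument to the function on each call. When the function takes several arguments, Pool.starmap() can unpack a tuple of arguments for each call. A minimal sketch, with power() as a hypothetical example function:

```python
from multiprocessing import Pool


def power(base, exponent):
    return base ** exponent


if __name__ == "__main__":
    pairs = [(1, 3), (2, 3), (3, 3)]
    with Pool(2) as pool:
        # Each tuple is unpacked into power(base, exponent).
        print(pool.starmap(power, pairs))  # [1, 8, 27]
```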

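Proxy objects

A proxy is an object that refers to a shared object living (presumably) in a different process. The shared object is called the referent of the proxy. Multiple proxy objects may have the same referent. A proxy object has methods that invoke the corresponding methods of its referent. The following is an example of a proxy object.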

from multiprocessing import Manager

if __name__ == '__main__':
    # Manager() starts a server process, so create it under the main guard
    manager = Manager()
    l = manager.list([i*i for i in range(10)])
    print(l)
    print(repr(l))
    print(l[4])
    print(l[2:5])

Output:

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
<ListProxy object, typeid 'list' at 0x7f063621ea10>
16
[4, 9, 16]

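Proxy objects are picklable, so they can be passed between processes. These objects are also used to control the level of synchronization.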
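
Because a proxy can be passed to another process, a child can modify the shared list and the parent sees the change. The following is a minimal sketch; append_squares() is a hypothetical function name.

```python
from multiprocessing import Manager, Process


def append_squares(shared, numbers):
    # Works on any list-like object, including a ListProxy.
    for n in numbers:
        shared.append(n * n)


if __name__ == "__main__":
    with Manager() as manager:
        shared = manager.list()    # proxy to a list held by the manager
        child = Process(target=append_squares, args=(shared, [1, 2, 3]))
        child.start()
        child.join()
        print(list(shared))  # [1, 4, 9]
```

A plain Python list passed the same way would be copied into the child, so appends made there would not be visible in the parent.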

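Commonly used functions

So far we have covered the basic concepts of multiprocessing in Python. Multiprocessing is a broad topic in its own right and is essential for running multiple tasks on a single system. The following functions and methods are commonly used when working with it.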

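Method	Description
Pipe()	Returns a pair of connection objects that represent the two ends of a pipe.
run()	Represents the process's activity; it can be overridden in a subclass.
start()	Starts the process.
join([timeout])	Blocks the calling process until the process whose join() method is called terminates. timeout is an optional argument.
is_alive()	Returns True if the process is still alive.
terminate()	Terminates the process. On Unix this is done with the SIGTERM signal; on Windows, TerminateProcess() is used.
kill()	Same as terminate(), but uses the SIGKILL signal on Unix.
close()	Closes the Process object and releases all resources associated with it.
qsize()	Returns the approximate size of the queue.
empty()	Returns True if the queue is empty.
full()	Returns True if the queue is full.
get_nowait()	Equivalent to get(False).
get()	Removes an item from the queue and returns it.
put()	Puts an item into the queue.
cpu_count()	Returns the number of CPUs in the system.
current_process()	Returns the Process object corresponding to the current process.
parent_process()	Returns the Process object corresponding to the parent of the current process.
task_done()	Indicates that a formerly enqueued task is complete (used with JoinableQueue).
join_thread()	Joins the queue's background thread; it can only be called after close().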
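
Of the functions listed above, Pipe() has not been shown in an example. The sketch below sends a message to a child process over a pipe and reads the reply; reply() is a hypothetical function name.

```python
from multiprocessing import Pipe, Process


def reply(conn):
    # Receive one message, send it back reversed, then close this end.
    message = conn.recv()
    conn.send(message[::-1])
    conn.close()


if __name__ == "__main__":
    parent_conn, child_conn = Pipe()   # two connected endpoints
    child = Process(target=reply, args=(child_conn,))
    child.start()
    parent_conn.send("ping")
    print(parent_conn.recv())  # gnip
    child.join()
```

A pipe connects exactly two endpoints, whereas a Queue can be shared by many producers and consumers.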
