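I've run into an issue with Python's subprocess module that I don't understand; I've reproduced it on Python 3.6 and 3.7. I have a program, call it Main, that launches an external process, call it "Slave", using subprocess.Popen(). Main registers a SIGTERM signal handler. Main waits for the Slave process to complete using either proc.wait(None) or proc.wait(timeout). The Slave process can be interrupted by sending a SIGTERM signal to Main. The SIGTERM handler then sends a SIGINT signal to the Slave and calls wait(30) to wait for it to terminate. If Main is using wait(None), the handler's wait(30) waits the full 30 seconds even though the Slave process has already terminated. If Main is using the wait(timeout) version, the handler's wait(30) returns as soon as the Slave terminates.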

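Below is a small test application that demonstrates the problem. Run it with python wait_test.py to use the no-timeout wait(None). Run it with python wait_test.py <timeout value> to give the main wait a specific timeout.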

Once the program is running, execute kill -15 <pid> and observe how the application reacts.

#
# Save this to a file called wait_test.py
#
import signal
import subprocess
import sys
from datetime import datetime

slave_proc = None


def sigterm_handler(signum, stack):
    print("Process received SIGTERM signal {} while processing job!".format(signum))
    print("slave_proc is {}".format(slave_proc))

    if slave_proc is not None:
        try:
            print("{}: Sending SIGINT to slave.".format(datetime.now()))
            slave_proc.send_signal(signal.SIGINT)
            slave_proc.wait(30)
            print("{}: Handler wait completed.".format(datetime.now()))
        except subprocess.TimeoutExpired:
            slave_proc.terminate()
        except Exception as exception:
            print('Sigterm Exception: {}'.format(exception))
            slave_proc.terminate()
            slave_proc.send_signal(signal.SIGKILL)


def main(wait_val=None):
    with open("stdout.txt", 'w+') as stdout:
        with open("stderr.txt", 'w+') as stderr:
            proc = subprocess.Popen(["python", "wait_test.py", "slave"],
                                    stdout=stdout,
                                    stderr=stderr,
                                    universal_newlines=True)

    print('Slave Started')

    global slave_proc
    slave_proc = proc

    try:
        proc.wait(wait_val)    # If this is a no-timeout wait, ie: wait(None), then will hang in sigterm_handler.
        print('Slave Finished by itself.')
    except subprocess.TimeoutExpired as te:
        print(te)
        print('Slave finished by timeout')
        proc.send_signal(signal.SIGINT)
        proc.wait()

    print("Job completed")


if __name__ == '__main__':
    if len(sys.argv) > 1 and sys.argv[1] == 'slave':
        while True:
            pass

    signal.signal(signal.SIGTERM, sigterm_handler)
    main(int(sys.argv[1]) if len(sys.argv) > 1 else None)
    print("{}: Exiting main.".format(datetime.now()))

Here is sample output from two runs:

Note here the 30 second delay
--------------------------------
[mkurtz@localhost testing]$ python wait_test.py
Slave Started
Process received SIGTERM signal 15 while processing job!
slave_proc is <subprocess.Popen object at 0x7f79b50e8d90>
2022-03-30 11:08:15.526319: Sending SIGINT to slave.   <--- 11:08:15
Slave Finished by itself.
Job completed
2022-03-30 11:08:45.526942: Exiting main.              <--- 11:08:45


Note here the instantaneous shutdown
-------------------------------------
[mkurtz@localhost testing]$ python wait_test.py 100
Slave Started
Process received SIGTERM signal 15 while processing job!
slave_proc is <subprocess.Popen object at 0x7fa2412a2dd0>
2022-03-30 11:10:03.649931: Sending SIGINT to slave.   <--- 11:10:03.649
2022-03-30 11:10:03.653170: Handler wait completed.    <--- 11:10:03.653
Slave Finished by itself.
Job completed
2022-03-30 11:10:03.673234: Exiting main.              <--- 11:10:03.673

These particular tests were run on CentOS 7 with Python 3.7.9.

Recommended answer

The Popen class has an internal lock for wait operations:

        # Held while anything is calling waitpid before returncode has been
        # updated to prevent clobbering returncode if wait() or poll() are
        # called from multiple threads at once.  After acquiring the lock,
        # code must re-check self.returncode to see if another thread just
        # finished a waitpid() call.
        self._waitpid_lock = threading.Lock()

The main difference between wait() and wait(timeout=...) is that the former waits indefinitely while holding the lock, whereas the latter is a busy loop that releases the lock on each iteration.

            if timeout is not None:
                ...
                while True:
                    if self._waitpid_lock.acquire(False):
                        try:
                            ...
                            # wait without any delay
                            (pid, sts) = self._try_wait(os.WNOHANG)
                            ...
                        finally:
                            self._waitpid_lock.release()
                    ...
                    time.sleep(delay)
            else:
                while self.returncode is None:
                    with self._waitpid_lock:  # acquire lock unconditionally
                        ...
                        # wait indefinitely
                        (pid, sts) = self._try_wait(0)

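For regular concurrent code (i.e. threading), this is not an issue, because the thread that is running wait() and holding the lock is woken up as soon as the subprocess finishes. That in turn lets every other thread waiting on the lock or the subprocess proceed promptly.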
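The threaded case can be checked with a small experiment. This is only a sketch and not part of the original answer: the sleeping child command, the helper name `timed_wait_in_threads`, and the timings are assumptions picked for illustration. One thread blocks in `wait()` while a second thread calls `wait(timeout=...)`; both should return shortly after the child exits.

```python
import subprocess
import sys
import threading
import time


def timed_wait_in_threads(child_seconds=0.5, timeout=10):
    """Measure a wait(timeout) call made while another thread sits in wait()."""
    # Hypothetical child: a Python process that sleeps briefly, then exits.
    proc = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep({})".format(child_seconds)]
    )
    results = {}

    def blocking_waiter():
        # wait() with no timeout blocks until the child exits.
        results["blocking_rc"] = proc.wait()

    def polling_waiter():
        start = time.monotonic()
        # wait(timeout) polls, so it returns once the exit status is recorded.
        results["polling_rc"] = proc.wait(timeout=timeout)
        results["polling_elapsed"] = time.monotonic() - start

    threads = [
        threading.Thread(target=blocking_waiter),
        threading.Thread(target=polling_waiter),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    print(timed_wait_in_threads())
```

With separate threads, the timed wait should finish well inside its timeout, in contrast to the signal handler case described in the question.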


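However, things look different when a) the main thread holds the lock in wait() and b) a signal handler attempts to wait. A subtle point about signal handlers is that they interrupt the main thread: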

signal: Signals and Threads

Python signal handlers are always executed in the main Python thread of the main interpreter, even if the signal was received in another thread. […]

Since the signal handler runs in the main thread, the main thread's regular code execution is paused until the signal handler finishes!
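The effect of that pause on a held lock can be modeled without signals at all. The sketch below is a toy, not the subprocess implementation: `fake_waitpid_lock`, `handler_like_wait`, `run_model`, and the timings are made-up names and values. A plain function call stands in for the signal handler, since in both cases the code that holds the lock cannot continue until the call returns.

```python
import threading
import time

# Stand-in for Popen's internal lock (hypothetical name, not the real attribute).
fake_waitpid_lock = threading.Lock()


def handler_like_wait(timeout, delay=0.01):
    """Poll for the lock until timeout; return True if it could be acquired."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if fake_waitpid_lock.acquire(False):  # non-blocking attempt
            fake_waitpid_lock.release()
            return True
        time.sleep(delay)
    return False


def run_model():
    # The "main code" holds the lock, as a wait() with no timeout would.
    with fake_waitpid_lock:
        # The "handler" runs on the same thread, so the holder cannot resume
        # and release the lock until the handler gives up.
        acquired_while_held = handler_like_wait(timeout=0.2)
    # Once the holder has resumed and released, the lock is available again.
    acquired_after_release = handler_like_wait(timeout=0.2)
    return acquired_while_held, acquired_after_release


if __name__ == "__main__":
    print(run_model())  # prints (False, True)
```

The first attempt can only end by timing out, which mirrors the full 30 second delay seen in the first sample run.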

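By running wait in the signal handler, a) the signal handler blocks waiting for the lock and b) the lock blocks waiting for the signal handler. Only once the signal handler's wait times out does the "main thread" resume, receive confirmation that the subprocess finished, set the return code, and release the lock.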
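One way to sidestep this, in line with the behavior observed in the question, is to keep the handler non-blocking and let the main code do all the waiting in short timed steps. The following is a minimal sketch under that assumption, not code from the answer: `run_child`, `grace`, `poll_interval`, and the sleeping child command are hypothetical, and it assumes a POSIX system with the function called from the main thread.

```python
import signal
import subprocess
import sys
import time


def run_child(cmd, grace=30.0, poll_interval=0.5):
    """Run cmd; on SIGTERM, forward SIGINT and allow `grace` seconds to exit."""
    proc = subprocess.Popen(cmd)
    state = {"deadline": None}

    def on_sigterm(signum, frame):
        # Never wait inside the handler: forward the request, note a deadline.
        proc.send_signal(signal.SIGINT)
        state["deadline"] = time.monotonic() + grace

    previous = signal.signal(signal.SIGTERM, on_sigterm)
    try:
        while True:
            try:
                # Short timed waits keep the main thread out of the blocking
                # wait() path, so it never sleeps while holding the lock.
                return proc.wait(timeout=poll_interval)
            except subprocess.TimeoutExpired:
                deadline = state["deadline"]
                if deadline is not None and time.monotonic() >= deadline:
                    proc.kill()  # grace period is over
                    return proc.wait()
    finally:
        if previous is not None:
            signal.signal(signal.SIGTERM, previous)


if __name__ == "__main__":
    # Hypothetical child that would otherwise run for a minute.
    rc = run_child([sys.executable, "-c", "import time; time.sleep(60)"])
    print("child exited with", rc)
```

The trade-off is that Main notices the child's exit with up to `poll_interval` of latency, in exchange for a handler that returns immediately.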
