Apologies in advance for the very long question; it has been a long journey, so thank you in advance for your patience.

TL;DR:

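Python throws a segfault with this setup:

[NumPy code] <-- [Python/C API] <-- [C-implemented .so library] <-- [py driver using ctypes' cdll]

Note that the py driver is only for testing the .so library; the lib is ultimately intended for use by another third-party application.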

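Background

Our team has some Python code that we are trying to integrate into a third-party application, and that application requires symbols to be provided in the form of a shared object file (a .so library). To achieve this, I am trying to create a shared object library in C that uses the Python/C API. I ran into problems getting the program to behave as expected, so I tried to boil the problem down to a "simplest possible" example:

I am writing a basic shared object library in C using the Python/C API; in particular, the library's functions depend on numpy (this will be important later).

The library is ultimately intended to be used by a pre-existing runtime application (I have no control over how that program is compiled or linked), which loads symbols dynamically, e.g. via dlfcn.h.

I am running into what appears to be a linking problem, but it could also be a problem with how the .so is compiled and/or how I am using or configuring Python through the C API.

I know this is possible, since I have found existing use cases on the Internet, but I cannot pin down where my setup fails.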

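Setup

Assume I have a system-wide installation of python3.8-dev on Ubuntu 20.04 (for this example I am actually working in a Docker container, so I am happy to provide the Dockerfile if this turns out to be a system setup issue).

In addition, I have a virtual environment set up in, e.g., ~/venv, which is activated and has NumPy installed. I can verify this with, e.g.,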

(venv) user@4189d31a5bbe:~$ which python && python -c "import numpy; print(numpy)"
/home/user/venv/bin/python
<module 'numpy' from '/home/user/venv/lib/python3.8/site-packages/numpy/__init__.py'>

Library files

I have the following header/source files:

mylibwithpy.h:

#ifndef __MYLIBWITHPY__
#define __MYLIBWITHPY__

#include <stdio.h>
#include <Python.h>

void someFunctionWithPython();

#endif

mylibwithpy.c:

All the function someFunctionWithPython does is check whether Python has been initialized, initialize it if it has not, and then try to import numpy.

#include "mylibwithpy.h"

void someFunctionWithPython()
{
    if (!Py_IsInitialized())
    {
        printf("Initializing python...\n");
        Py_Initialize();
    }
    else
    {
        printf("python alread initialized.\n");
    }

    printf("importing numpy...\n");
    PyObject* numpy = PyImport_ImportModule("numpy");
    if (numpy == NULL)
    {
        printf("Warning: error during import:\n");
        PyErr_Print();
        Py_Finalize();
        exit(1);
    }
    return;
}

The library .so file is compiled via the following make targets:

mylibwithpy.o:
    gcc -L/usr/lib/x86_64-linux-gnu -I/usr/include/python3.8 -Wall -c mylibwithpy.c -o $@ -lpython3.8

mylibwithpy.so: mylibwithpy.o 
    gcc -L/usr/lib/x86_64-linux-gnu -Wall -fPIC -shared -Wl,-soname,$@ -o $@ mylibwithpy.o -lpython3.8

At this point, ldd looks "OK" so far:

(venv) user@4189d31a5bbe:~$ ldd mylibwithpy.so 
    linux-vdso.so.1 (0x00007ffc36ba6000)
    libpython3.8.so.1.0 => /lib/x86_64-linux-gnu/libpython3.8.so.1.0 (0x00007f295c5ea000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f295c3f8000)
    libexpat.so.1 => /lib/x86_64-linux-gnu/libexpat.so.1 (0x00007f295c3ca000)
    libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f295c3ae000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f295c38b000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f295c385000)
    libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f295c37e000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f295c22f000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f295cb4f000)

Example runtimes

This is where things start to go wrong...

First, let's try a basic Python driver using Python's ctypes library.

driver.py

from ctypes import cdll

if __name__ == "__main__":

    print("opening mylibwithpy.so...");
    my_so = cdll.LoadLibrary("mylibwithpy.so")

    print(".so object: ", my_so)
    print(".so object's 'someFunctionWithPython': ", my_so.someFunctionWithPython)

    print("calling someFunctionWithPython...");
    my_so.someFunctionWithPython()

This basic script results in a segmentation fault when the lib function tries to import numpy:

(venv) user@4189d31a5bbe:~$ LD_LIBRARY_PATH=. python driver.py 
opening mylibwithpy.so...
.so object:  <CDLL 'mylibwithpy.so', handle 19f43b0 at 0x7fd873b72610>
.so object's 'someFunctionWithPython':  <_FuncPtr object at 0x7fd873ac91c0>
calling someFunctionWithPython...
python alread initialized.
importing numpy...
Segmentation fault (core dumped)

OK, I'm really not sure how to start debugging this one, so let's use C:

driver.c:

#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

int main()
{
    printf("opening mylibwithpy.so...\n");
    void* mylibwithpy_so = dlopen("mylibwithpy.so", RTLD_LAZY);
    if (mylibwithpy_so == NULL){
        printf("an error occurred during loading mylibwithpy.so: \n%s\n", dlerror());
        exit(1);
    }

    void (*soFunc)();
    soFunc = dlsym(mylibwithpy_so, "someFunctionWithPython");
    if (soFunc == NULL){
        printf("an error occurred during loading symbol someFunctionWithPython: \n%s\n", dlerror());
        exit(1);
    }

    soFunc();

    return 0;
}

Compile this program via:

gcc -L/usr/lib/x86_64-linux-gnu -Wall driver.c -o cdriver -ldl

Running this driver reports a much more interesting, detailed error:

(venv) user@4189d31a5bbe:~$ LD_LIBRARY_PATH=. ./cdriver 
opening mylibwithpy.so...
Initializing python...
importing numpy...
Warning: error during import:
Traceback (most recent call last):
  File "/home/user/venv/lib/python3.8/site-packages/numpy/core/__init__.py", line 23, in <module>
    from . import multiarray
  File "/home/user/venv/lib/python3.8/site-packages/numpy/core/multiarray.py", line 10, in <module>
    from . import overrides
  File "/home/user/venv/lib/python3.8/site-packages/numpy/core/overrides.py", line 6, in <module>
    from numpy.core._multiarray_umath import (
ImportError: /home/user/venv/lib/python3.8/site-packages/numpy/core/_multiarray_umath.cpython-38-x86_64-linux-gnu.so: undefined symbol: PyObject_SelfIter

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/user/venv/lib/python3.8/site-packages/numpy/__init__.py", line 141, in <module>
    from . import core
  File "/home/user/venv/lib/python3.8/site-packages/numpy/core/__init__.py", line 49, in <module>
    raise ImportError(msg)
ImportError: 

IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!

Importing the numpy C-extensions failed. This error can happen for
many reasons, often due to issues with your setup or how NumPy was
installed.

We have compiled some common reasons and troubleshooting tips at:

    https://numpy.org/devdocs/user/troubleshooting-importerror.html

Please note and check the following:

  * The Python version is: Python3.8 from "/home/user/venv/bin/python3"
  * The NumPy version is: "1.24.2"

and make sure that they are the versions you expect.
Please carefully study the documentation linked above for further help.

Original error was: /home/user/venv/lib/python3.8/site-packages/numpy/core/_multiarray_umath.cpython-38-x86_64-linux-gnu.so: undefined symbol: PyObject_SelfIter

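Aha!

According to this, NumPy seems to have some shared objects of its own, but they are somehow missing some symbols (to be precise, this symbol is listed among the Python/C API "stable ABI contents" symbols):

/home/user/venv/lib/python3.8/site-packages/numpy/core/_multiarray_umath.cpython-38-x86_64-linux-gnu.so: undefined symbol: PyObject_SelfIter

(Another side note: the NumPy docs page referenced in the message, https://numpy.org/devdocs/user/troubleshooting-importerror.html, does not seem to apply to my situation or the error I am getting, but a more insightful eye may catch something useful that I overlooked...)

Additionally, a quick check with ldd shows that libpython is indeed not among the libraries this extension module is dynamically linked against: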

(venv) user@4189d31a5bbe:~$ ldd /home/user/venv/lib/python3.8/site-packages/numpy/core/_multiarray_umath.cpython-38-x86_64-linux-gnu.so
    linux-vdso.so.1 (0x00007ffed7df2000)
    libopenblas64_p-r0-15028c96.3.21.so => /home/user/venv/lib/python3.8/site-packages/numpy/core/../../numpy.libs/libopenblas64_p-r0-15028c96.3.21.so (0x00007f98a1ae6000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f98a198f000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f98a196c000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f98a177a000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f98a3f29000)
    libgfortran-040039e1.so.5.0.0 => /home/user/venv/lib/python3.8/site-packages/numpy/core/../../numpy.libs/libgfortran-040039e1.so.5.0.0 (0x00007f98a12ed000)
    libquadmath-96973f99.so.0.0.0 => /home/user/venv/lib/python3.8/site-packages/numpy/core/../../numpy.libs/libquadmath-96973f99.so.0.0.0 (0x00007f98a10ae000)
    libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f98a1092000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f98a1077000)

Testing the linker-issue hypothesis

Since the cdriver program gave us an undefined symbol error, I can try forcing libpython to be linked into cdriver via:

gcc -L/usr/lib/x86_64-linux-gnu -Wall driver.c -o cdriver -ldl -Wl,--no-as-needed -lpython3.8

Lo and behold, this time the program finishes without error:

(venv) user@4189d31a5bbe:~$ LD_LIBRARY_PATH=. ./cdriver 
opening mylibwithpy.so...
Initializing python...
importing numpy...
(venv) user@4189d31a5bbe:~$ 

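Note that in the actual build I won't be able to compile or link the runtime program, so this check is not a solution, but it does seem to help diagnose the problem?

So... on to the actual questions

  1. What needs to be done for, e.g., the Python driver to work as expected?
  2. Why are NumPy's internal shared object files not linked against libpython3.8.so?
  3. What am I missing?! :cry:

I am hoping I am just missing some small but critical step in compiling the .so or configuring Python.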


EDIT:

I am able to consistently reproduce the issue using a very simple Docker container:

Dockerfile:

FROM ubuntu:20.04

RUN apt-get update; \
    DEBIAN_FRONTEND=noninteractive apt-get install -y \
            build-essential \
            vim \
            python3.8-dev \
            python3.8-venv

RUN useradd --create-home --shell /bin/bash user
USER user
WORKDIR /home/user
RUN python3 -m venv venv
RUN /bin/bash -c "source venv/bin/activate && pip install numpy"

Makefile:

all: mylibwithpy.so cdriver

clean:
    rm mylibwithpy.o mylibwithpy.so cdriver

cdriver:
    gcc -L/usr/lib/x86_64-linux-gnu -Wall driver.c -o $@ -ldl

mylibwithpy.o:
    gcc -L/usr/lib/x86_64-linux-gnu -I/usr/include/python3.8 -Wall -c mylibwithpy.c -o $@ -lpython3.8

mylibwithpy.so: mylibwithpy.o 
    gcc -L/usr/lib/x86_64-linux-gnu -Wall -fPIC -shared -Wl,-soname,$@ -o $@ mylibwithpy.o -lpython3.8

Then run with the same command as above:

LD_LIBRARY_PATH=. ./cdriver

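Recommended answer

When you call a function exported through cdll.LoadLibrary, the global interpreter lock (GIL) is released on entry to the function. If you want to call Python code from inside it, you need to reacquire the lock.

For example: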

void someFunctionWithPython()
{
    ...
    /* Reacquire the GIL, which ctypes' CDLL released before calling us. */
    PyGILState_STATE state = PyGILState_Ensure();
    printf("importing numpy...\n");
    PyObject* numpy = PyImport_ImportModule("numpy");
    if (numpy == NULL)
    {
        printf("Warning: error during import:\n");
        PyErr_Print();
        /* No PyGILState_Release here: after Py_Finalize the interpreter is gone. */
        Py_Finalize();
        exit(1);
    }

    /* Print repr(numpy) to show that the import really succeeded. */
    PyObject* repr = PyObject_Repr(numpy);
    PyObject* str = repr ? PyUnicode_AsEncodedString(repr, "utf-8", "strict") : NULL;
    if (str != NULL)
    {
        printf("REPR: %s\n", PyBytes_AS_STRING(str));
    }

    Py_XDECREF(repr);
    Py_XDECREF(str);
    Py_DECREF(numpy);

    PyGILState_Release(state);
    return;
}

$ gcc $(python3.9-config --includes --ldflags --embed) -shared -o mylibwithpy.so mylibwithpy.c
$ LD_LIBRARY_PATH=. python driver.py
opening mylibwithpy.so...
.so object:  <CDLL 'mylibwithpy.so', handle 1749f50 at 0x7fb603702fa0>
.so object's 'someFunctionWithPython':  <_FuncPtr object at 0x7fb603679040>
calling someFunctionWithPython...
python alread initialized.
importing numpy...
REPR: <module 'numpy' from '/home/me/test/.venv/lib/python3.9/site-packages/numpy/__init__.py'>

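Also, if you look at the documentation for PyDLL, it says:

Instances of this class behave like CDLL instances, except that the Python GIL is not released during the function call, and after the function execution the Python error flag is checked. If the error flag is set, a Python exception is raised.

So if you use PyDLL for your driver, you won't need to reacquire the lock in the C code: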

from ctypes import PyDLL

if __name__ == "__main__":
    print("opening mylibwithpy.so...");
    my_so = PyDLL("mylibwithpy.so")

    print(".so object: ", my_so)
    print(".so object's 'someFunctionWithPython': ", my_so.someFunctionWithPython)

    print("calling someFunctionWithPython...");
    my_so.someFunctionWithPython()
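
The difference between the two loaders can be observed without building any C code. The following is a minimal sketch, not part of the original setup: it calls the C API function PyGILState_Check through ctypes.pythonapi (a PyDLL handle that ctypes provides) and through a CDLL handle. It assumes a Linux CPython where CDLL(None) resolves symbols from the running process; the variable names are made up for illustration.

```python
import ctypes

# ctypes.pythonapi is a ready-made PyDLL handle to the interpreter's C API,
# so calls made through it keep the GIL held.
held_via_pydll = ctypes.pythonapi.PyGILState_Check()

# Assumption: on Linux, CDLL(None) looks symbols up in the global namespace
# of the current process, which includes the interpreter's C API. Calls made
# through a CDLL handle release the GIL for the duration of the call.
process_as_cdll = ctypes.CDLL(None)
held_via_cdll = process_as_cdll.PyGILState_Check()

print("GIL held inside a PyDLL call:", held_via_pydll)
print("GIL held inside a CDLL call: ", held_via_cdll)
```

A result of 1 for the PyDLL call and 0 for the CDLL call matches the behaviour described above: a function reached through CDLL runs without the GIL and must not touch Python objects until it has called PyGILState_Ensure.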

UPDATE

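Why are NumPy's internal shared object files not linked against libpython3.8.so?

I believe NumPy is set up this way because it expects to be called by a Python interpreter that has already loaded libpython and made its symbols available.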
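
One way to check this from inside a running interpreter is to ask whether the missing symbol resolves in the process-wide namespace. This is only a sketch under the assumption of a Linux CPython, where CDLL(None) corresponds to dlopen(NULL); the helper name is_globally_visible is hypothetical.

```python
import ctypes

# Assumption: on Linux, CDLL(None) is a handle that searches the global
# symbol namespace of the current process.
global_namespace = ctypes.CDLL(None)


def is_globally_visible(symbol_name):
    """Return True if symbol_name resolves in the process-wide namespace."""
    # ctypes raises AttributeError when a symbol lookup fails, so hasattr()
    # is enough to tell found from not found.
    return hasattr(global_namespace, symbol_name)


# PyObject_SelfIter is the symbol NumPy's extension module failed to find
# when it was loaded from the C driver.
print("PyObject_SelfIter visible:", is_globally_visible("PyObject_SelfIter"))
```

Inside the python executable the symbol is expected to be visible, which is why a plain "import numpy" works there. In the C driver, libpython arrives only as a private dependency of mylibwithpy.so, so the same lookup fails when NumPy's extension module is loaded.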

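That said, we can make the Python library's symbols available by the time mylibwithpy triggers the NumPy import, by using RTLD_GLOBAL. The dlopen documentation describes the flag as follows:

The symbols defined by this shared object will be made available for symbol resolution of subsequently loaded shared objects.

The update to the code is simple: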

void* mylibwithpy_so = dlopen("mylibwithpy.so", RTLD_LAZY | RTLD_GLOBAL);

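This now exposes the Python libraries as well, since they are dependencies of mylibwithpy, which means their symbols will be available by the time numpy loads its own shared libraries.

Alternatively, you can choose to load only libpythonX.Y.so with RTLD_GLOBAL before loading mylibwithpy.so, to minimize the number of symbols made globally available.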

printf("opening libpython3.9.so...\n");
void* libpython3_so = dlopen("libpython3.9.so", RTLD_LAZY | RTLD_GLOBAL);
if (libpython3_so == NULL){
    printf("an error occurred during loading libpython3.9.so: \n%s\n", dlerror());
    exit(1);
}

printf("opening mylibwithpy.so...\n");
void* mylibwithpy_so = dlopen("mylibwithpy.so", RTLD_LAZY);
if (mylibwithpy_so == NULL){
    printf("an error occurred during loading mylibwithpy.so: \n%s\n", dlerror());
    exit(1);
}
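
The library name hard-coded above changes with the Python version. As a rough sketch, the interpreter can be asked which names it was built with. INSTSONAME and LIBDIR are build-time configuration variables and may be missing, or may name a static archive, on builds without a shared libpython, so treat the output as a hint rather than a guarantee.

```python
import sysconfig

# File name of this interpreter's libpython as recorded at build time
# (may be None, or a static ".a" name, on builds without a shared library).
libpython_name = sysconfig.get_config_var("INSTSONAME")

# Library directory recorded at build time.
lib_dir = sysconfig.get_config_var("LIBDIR")

# Directory that contains Python.h, i.e. the value for the -I compiler flag.
include_dir = sysconfig.get_paths()["include"]

print("libpython file name:", libpython_name)
print("library directory:  ", lib_dir)
print("include directory:  ", include_dir)
```

Running this with the same interpreter that owns the virtual environment shows which libpython the dlopen call would need to name for that version.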

The Docker setup I used to recreate and test:

FROM ubuntu:20.04

ARG DEBIAN_FRONTEND=noninteractive

RUN apt-get update && apt-get install -y \
   build-essential \
   python3.9-dev \
   python3.9-venv 

RUN mkdir /workspace
WORKDIR /workspace

RUN python3.9 -m venv .venv
RUN .venv/bin/python -m pip install numpy

COPY . /workspace

RUN gcc -o mylibwithpy.so mylibwithpy.c -fPIC -shared \
    $(python3.9-config --includes --ldflags --embed --cflags) 

RUN gcc -o cdriver driver.c -L/usr/lib/x86_64-linux-gnu -Wall -ldl

ENV LD_LIBRARY_PATH=/workspace
# Then run: . .venv/bin/activate && ./cdriver
