Apologies in advance for the very long question. This has been a long journey, so thank you in advance for your patience.
TL;DR:
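Python segfaults when using this setup:
[NumPy code] <-- [Python/C API] <-- [C-implemented .so library] <-- [Python driver using ctypes' cdll]
Note that the Python driver is only for testing the .so library; the lib is ultimately intended to be used by another third-party application.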
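Background
Our team has some Python code that we are trying to integrate into another third-party application, and the application in question requires implementing symbols in the form of a shared object file (.so library). To accomplish this, I am trying to create a shared object library in C that leverages the Python/C API. I ran into some issues getting the program to behave as expected, so I tried to boil the problem down to an "as simple as possible" example:
- I am writing a basic shared object library in C that uses the Python/C API; in particular, the methods of this library have numpy dependencies (this will be important later).
- The library is ultimately intended to be used by a "pre-existing" "runtime application" (I have no control over the compilation/linking etc. of said program) through dynamic loading of symbols, e.g. dlfcn.h.
- I am hitting what seems to be a linking issue, but it could also be an issue with how the .so is compiled and/or how I am using/configuring Python via the C API.
- I know this is possible, since I have found some existing use cases on the internet, but I can't put my finger on where my setup is failing.
Setup
Assume I have a system-wide installation of python3.8-dev on Ubuntu 20.04 (in this case I am actually working in a docker container, so I am happy to provide the Dockerfile if this turns out to be a system setup issue).
Additionally, I have a virtual environment set up in, e.g., ~/venv, which is activated and has NumPy installed. I can verify this with, e.g.: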
(venv) user@4189d31a5bbe:~$ which python && python -c "import numpy; print(numpy)"
/home/user/venv/bin/python
<module 'numpy' from '/home/user/venv/lib/python3.8/site-packages/numpy/__init__.py'>
Library files
I have the following header/source files:
mylibwithpy.h:
#ifndef __MYLIBWITHPY__
#define __MYLIBWITHPY__
#include <stdio.h>
#include <Python.h>
void someFunctionWithPython();
#endif
mylibwithpy.c:
All the function someFunctionWithPython does is check whether Python is initialized, initialize it if it is not, and then try to import numpy.
#include "mylibwithpy.h"
void someFunctionWithPython()
{
    if (!Py_IsInitialized())
    {
        printf("Initializing python...\n");
        Py_Initialize();
    }
    else
    {
        printf("python alread initialized.\n");
    }
    printf("importing numpy...\n");
    PyObject* numpy = PyImport_ImportModule("numpy");
    if (numpy == NULL)
    {
        printf("Warning: error during import:\n");
        PyErr_Print();
        Py_Finalize();
        exit(1);
    }
    return;
}
The library's *.so file is compiled via the following make targets:
mylibwithpy.o:
	gcc -L/usr/lib/x86_64-linux-gnu -I/usr/include/python3.8 -Wall -c mylibwithpy.c -o $@ -lpython3.8
mylibwithpy.so: mylibwithpy.o
	gcc -L/usr/lib/x86_64-linux-gnu -Wall -fPIC -shared -Wl,-soname,$@ -o $@ mylibwithpy.o -lpython3.8
At this point, ldd seems to be "OK" so far:
(venv) user@4189d31a5bbe:~$ ldd mylibwithpy.so
linux-vdso.so.1 (0x00007ffc36ba6000)
libpython3.8.so.1.0 => /lib/x86_64-linux-gnu/libpython3.8.so.1.0 (0x00007f295c5ea000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f295c3f8000)
libexpat.so.1 => /lib/x86_64-linux-gnu/libexpat.so.1 (0x00007f295c3ca000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f295c3ae000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f295c38b000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f295c385000)
libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f295c37e000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f295c22f000)
/lib64/ld-linux-x86-64.so.2 (0x00007f295cb4f000)
Example runtimes
This is where things start to go wrong...
First, let's try a basic Python driver using Python's ctypes library.
driver.py:
from ctypes import cdll
if __name__ == "__main__":
    print("opening mylibwithpy.so...")
    my_so = cdll.LoadLibrary("mylibwithpy.so")
    print(".so object: ", my_so)
    print(".so object's 'someFunctionWithPython': ", my_so.someFunctionWithPython)
    print("calling someFunctionWithPython...")
    my_so.someFunctionWithPython()
This basic script results in a (Segmentation fault) error when the lib function tries to import numpy:
(venv) user@4189d31a5bbe:~$ LD_LIBRARY_PATH=. python driver.py
opening mylibwithpy.so...
.so object: <CDLL 'mylibwithpy.so', handle 19f43b0 at 0x7fd873b72610>
.so object's 'someFunctionWithPython': <_FuncPtr object at 0x7fd873ac91c0>
calling someFunctionWithPython...
python alread initialized.
importing numpy...
Segmentation fault (core dumped)
OK, I am really not sure how to start debugging this one, so let's try with C:
driver.c:
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
int main()
{
    printf("opening mylibwithpy.so...\n");
    void* mylibwithpy_so = dlopen("mylibwithpy.so", RTLD_LAZY);
    if (mylibwithpy_so == NULL){
        printf("an error occurred during loading mylibwithpy.so: \n%s\n", dlerror());
        exit(1);
    }
    void (*soFunc)();
    soFunc = dlsym(mylibwithpy_so, "someFunctionWithPython");
    if (soFunc == NULL){
        printf("an error occurred during loading symbol someFunctionWithPython: \n%s\n", dlerror());
        exit(1);
    }
    soFunc();
    return 0;
}
This program is compiled via:
gcc -L/usr/lib/x86_64-linux-gnu -Wall driver.c -o cdriver -ldl
Running this driver reports a much more interesting, detailed error:
(venv) user@4189d31a5bbe:~$ LD_LIBRARY_PATH=. ./cdriver
opening mylibwithpy.so...
Initializing python...
importing numpy...
Warning: error during import:
Traceback (most recent call last):
File "/home/user/venv/lib/python3.8/site-packages/numpy/core/__init__.py", line 23, in <module>
from . import multiarray
File "/home/user/venv/lib/python3.8/site-packages/numpy/core/multiarray.py", line 10, in <module>
from . import overrides
File "/home/user/venv/lib/python3.8/site-packages/numpy/core/overrides.py", line 6, in <module>
from numpy.core._multiarray_umath import (
ImportError: /home/user/venv/lib/python3.8/site-packages/numpy/core/_multiarray_umath.cpython-38-x86_64-linux-gnu.so: undefined symbol: PyObject_SelfIter
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/user/venv/lib/python3.8/site-packages/numpy/__init__.py", line 141, in <module>
from . import core
File "/home/user/venv/lib/python3.8/site-packages/numpy/core/__init__.py", line 49, in <module>
raise ImportError(msg)
ImportError:
IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!
Importing the numpy C-extensions failed. This error can happen for
many reasons, often due to issues with your setup or how NumPy was
installed.
We have compiled some common reasons and troubleshooting tips at:
https://numpy.org/devdocs/user/troubleshooting-importerror.html
Please note and check the following:
* The Python version is: Python3.8 from "/home/user/venv/bin/python3"
* The NumPy version is: "1.24.2"
and make sure that they are the versions you expect.
Please carefully study the documentation linked above for further help.
Original error was: /home/user/venv/lib/python3.8/site-packages/numpy/core/_multiarray_umath.cpython-38-x86_64-linux-gnu.so: undefined symbol: PyObject_SelfIter
Ah!
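Based on this, it appears that NumPy has some shared objects of its own, but they are somehow missing some symbols (to be precise, this symbol is listed among the Python/C API "stable ABI contents" symbols):
/home/user/venv/lib/python3.8/site-packages/numpy/core/_multiarray_umath.cpython-38-x86_64-linux-gnu.so: undefined symbol: PyObject_SelfIter
(Another side note: the NumPy docs reference listed, https://numpy.org/devdocs/user/troubleshooting-importerror.html, does not seem very applicable to my situation or the error I am receiving, but more insightful eyes may spot something useful that I overlooked...)
Furthermore, a quick check with ldd shows that libpython is indeed not among the dynamically linked libraries: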
(venv) user@4189d31a5bbe:~$ ldd /home/user/venv/lib/python3.8/site-packages/numpy/core/_multiarray_umath.cpython-38-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffed7df2000)
libopenblas64_p-r0-15028c96.3.21.so => /home/user/venv/lib/python3.8/site-packages/numpy/core/../../numpy.libs/libopenblas64_p-r0-15028c96.3.21.so (0x00007f98a1ae6000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f98a198f000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f98a196c000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f98a177a000)
/lib64/ld-linux-x86-64.so.2 (0x00007f98a3f29000)
libgfortran-040039e1.so.5.0.0 => /home/user/venv/lib/python3.8/site-packages/numpy/core/../../numpy.libs/libgfortran-040039e1.so.5.0.0 (0x00007f98a12ed000)
libquadmath-96973f99.so.0.0.0 => /home/user/venv/lib/python3.8/site-packages/numpy/core/../../numpy.libs/libquadmath-96973f99.so.0.0.0 (0x00007f98a10ae000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f98a1092000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f98a1077000)
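My reading of this (an assumption on my part, not something the output proves) is that extension modules such as _multiarray_umath leave the Python/C API symbols undefined and expect the hosting process to supply them at load time. Below is a minimal Python sketch of how that could be checked from inside a process. The helper name symbol_visible_globally is made up for illustration, and ctypes.CDLL(None) is assumed to behave like dlopen(NULL), as it does on Linux.

```python
import ctypes


def symbol_visible_globally(name):
    """Return True if `name` resolves through the process-wide symbol scope.

    On Linux, CDLL(None) wraps dlopen(NULL): the handle covers the main
    program, its link-time dependencies, and anything loaded RTLD_GLOBAL.
    """
    process = ctypes.CDLL(None)
    try:
        getattr(process, name)  # ctypes raises AttributeError if the lookup fails
    except AttributeError:
        return False
    return True


if __name__ == "__main__":
    # The symbol that NumPy's extension module complained about.
    print("PyObject_SelfIter visible:", symbol_visible_globally("PyObject_SelfIter"))
```

If the hypothesis holds, the result should depend on how libpython ended up in the process, not on what ldd reports for the NumPy extension itself.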
Testing the linker issue hypothesis
Since the cdriver program reported an undefined symbol error, I can try force-linking libpython into cdriver via:
gcc -L/usr/lib/x86_64-linux-gnu -Wall driver.c -o cdriver -ldl -Wl,--no-as-needed -lpython3.8
Lo and behold, this time the program completes without error:
(venv) user@4189d31a5bbe:~$ LD_LIBRARY_PATH=. ./cdriver
opening mylibwithpy.so...
Initializing python...
importing numpy...
(venv) user@4189d31a5bbe:~$
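A possible explanation, which I have not confirmed: driver.c calls dlopen with RTLD_LAZY only, so mylibwithpy.so and its libpython dependency may be loaded with local visibility, whereas linking libpython into cdriver puts its symbols in the global scope. The Python sketch below only illustrates the flag difference. dlopen_flags and load_library are hypothetical helper names, and os.RTLD_LAZY / os.RTLD_GLOBAL are assumed to be available (they are Unix-only).

```python
import ctypes
import os


def dlopen_flags(make_global):
    """Build dlopen-style flags: lazy binding, optionally process-wide symbols."""
    flags = os.RTLD_LAZY
    if make_global:
        # RTLD_GLOBAL makes the library's symbols available to libraries
        # loaded later; without it, dlopen defaults to RTLD_LOCAL.
        flags |= os.RTLD_GLOBAL
    return flags


def load_library(path, make_global=True):
    """Load a shared library through ctypes with the chosen visibility."""
    return ctypes.CDLL(path, mode=dlopen_flags(make_global))


# Hypothetical use, with the library name taken from the ldd output above:
#   load_library("libpython3.8.so.1.0", make_global=True)
```

The same flag could presumably be passed to dlopen from C, either by the runtime application or by the library itself before it initializes Python.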
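Note: in the actual build I will not be able to compile/link the runtime program, so this check is not a solution, but it does seem to help diagnose the problem?
So... onto the actual questions
- What needs to be done to get, e.g., the Python driver to work as expected?
- Why are NumPy's internal shared object files not linked against libpython3.8.so?
- What am I missing?! :cry:
I am hoping I have just missed some small but critical step in compiling the .so or configuring Python.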
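One detail about the Python driver that I have not ruled out: as I understand the ctypes documentation, libraries loaded through cdll / CDLL release the GIL around each foreign call, while ctypes.PyDLL keeps it held. Since someFunctionWithPython calls into the Python/C API, that difference might matter. The sketch below shows the PyDLL calling convention using ctypes.pythonapi (itself a PyDLL handle), so it runs without my library. import_via_c_api and call_my_lib are illustrative names, not part of any existing code.

```python
import ctypes


def import_via_c_api(module_name):
    """Import a module by calling PyImport_ImportModule through a PyDLL handle."""
    api = ctypes.pythonapi  # a PyDLL instance: the GIL stays held during the call
    api.PyImport_ImportModule.argtypes = [ctypes.c_char_p]
    api.PyImport_ImportModule.restype = ctypes.py_object
    return api.PyImport_ImportModule(module_name.encode("utf-8"))


def call_my_lib(path="mylibwithpy.so"):
    """Hypothetical driver variant: same call as driver.py, but via PyDLL."""
    lib = ctypes.PyDLL(path)
    lib.someFunctionWithPython.restype = None  # the C function returns void
    lib.someFunctionWithPython()


if __name__ == "__main__":
    # Stand-in for the numpy import, using a stdlib module.
    print(import_via_c_api("json"))
```

Whether swapping cdll for PyDLL changes the segfault in driver.py is something I have not verified.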
EDIT:
I was able to consistently reproduce this issue using a very simple docker container:
Dockerfile:
FROM ubuntu:20.04
RUN apt-get update; \
    DEBIAN_FRONTEND=noninteractive apt-get install -y \
        build-essential \
        vim \
        python3.8-dev \
        python3.8-venv
RUN useradd --create-home --shell /bin/bash user
USER user
WORKDIR /home/user
RUN python3 -m venv venv
RUN /bin/bash -c "source venv/bin/activate && pip install numpy"
Makefile:
all: mylibwithpy.so cdriver
clean:
	rm mylibwithpy.o mylibwithpy.so cdriver
cdriver:
	gcc -L/usr/lib/x86_64-linux-gnu -Wall driver.c -o $@ -ldl
mylibwithpy.o:
	gcc -L/usr/lib/x86_64-linux-gnu -I/usr/include/python3.8 -Wall -c mylibwithpy.c -o $@ -lpython3.8
mylibwithpy.so: mylibwithpy.o
	gcc -L/usr/lib/x86_64-linux-gnu -Wall -fPIC -shared -Wl,-soname,$@ -o $@ mylibwithpy.o -lpython3.8
Then run it using the same command as above:
LD_LIBRARY_PATH=. ./cdriver