Complex C++ lifetime issue in Python bindings between C++ and NumPy

Posted on February 14

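I am looking for advice on how to handle a complex lifetime issue between C++ and NumPy/Python. Sorry for the wall of text, but I wanted to provide as much context as possible.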

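I developed cvnp, a library that provides casts between bindings for cv::Mat and py::array objects, so that memory is shared between the two when using pybind11. It was originally based on a SO answer by Dan Mašek. All went well, and the library is used in several projects, including robotpy, a Python library for the FIRST Robotics Competition.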

However, a user raised an issue concerning the lifetime of linked cv::Mat and py::array objects.

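In the direction cv::Mat -> py::array, all is fine: mat_to_nparray creates a py::array that keeps a reference to the linked cv::Mat via a "capsule" (a Python handle).
However, in the direction py::array -> cv::Mat, nparray_to_mat creates a cv::Mat that accesses the data of the py::array without holding a reference to the array (so the lifetime of the py::array is not guaranteed to match that of the cv::Mat).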

See mat_to_nparray:

py::capsule make_capsule_mat(const cv::Mat& m)
{
    return py::capsule(new cv::Mat(m)
        , [](void *v) { delete reinterpret_cast<cv::Mat*>(v); }
    );
}

pybind11::array mat_to_nparray(const cv::Mat& m)
{
    return pybind11::array(detail::determine_np_dtype(m.depth())
        , detail::determine_shape(m)
        , detail::determine_strides(m)
        , m.data
        , detail::make_capsule_mat(m)
        );
}

and nparray_to_mat:

cv::Mat nparray_to_mat(pybind11::array& a)
{
    ...
    cv::Mat m(size, type, is_not_empty ? a.mutable_data(0) : nullptr);
    return m;
}
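The asymmetry between the two directions can be illustrated without pybind11 or OpenCV. The following is only a minimal sketch: `Storage`, `OwningView`, and `BorrowingView` are hypothetical stand-ins (not cvnp, pybind11, or OpenCV types), and `std::shared_ptr` plays the role of Python reference counting.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for the memory behind an array or a Mat.
using Storage = std::vector<std::int32_t>;

// Models the cv::Mat -> py::array direction: the view carries a handle
// (playing the role of the capsule) that keeps the storage alive.
struct OwningView {
    std::int32_t* data;
    std::shared_ptr<Storage> keep_alive;
};

// Models the py::array -> cv::Mat direction as described in the post:
// only a raw pointer is kept, with no reference to the owner.
struct BorrowingView {
    std::int32_t* data;
};

OwningView make_owning_view(const std::shared_ptr<Storage>& s) {
    return OwningView{s->data(), s};
}

BorrowingView make_borrowing_view(const std::shared_ptr<Storage>& s) {
    return BorrowingView{s->data()};
}

// Number of owners of the storage while an owning view exists.
long owners_with_owning_view() {
    auto s = std::make_shared<Storage>(std::size_t{16});
    OwningView v = make_owning_view(s);
    (void)v;
    return s.use_count();
}

// Number of owners of the storage while a borrowing view exists.
long owners_with_borrowing_view() {
    auto s = std::make_shared<Storage>(std::size_t{16});
    BorrowingView v = make_borrowing_view(s);
    (void)v;
    return s.use_count();
}
```

In this model the owning view adds an owner to the storage, while the borrowing view leaves the owner count unchanged, so nothing prevents the storage from being released while the borrowing view is still in use.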

This worked well until a user wrote the following:

A bound C++ function that returns the same cv::Mat that was passed as a parameter:

m.def("test", [](cv::Mat mat) { return mat; });

Some Python code that uses this function:

img = np.zeros(shape=(480, 640, 3), dtype=np.uint8)
img = test(img)

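In this case, a segmentation fault may occur, because the py::array object is destroyed before the cv::Mat object, and the cv::Mat object then tries to access the data of the py::array object. However, the segmentation fault is not systematic and depends on the OS and Python version.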
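The order of events can be sketched in plain C++. This is a simplified model under stated assumptions, not the actual pybind11 machinery: `FakeArray` and `FakeMat` are hypothetical stand-ins, and a `std::weak_ptr` is used to observe safely whether the original storage is still alive after `img` is rebound.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

using Storage = std::vector<std::uint8_t>;

// Hypothetical stand-in for py::array: a data pointer plus an optional owner.
struct FakeArray {
    std::uint8_t* data = nullptr;
    std::shared_ptr<Storage> owner;  // empty when the array does not own its data
};

// Hypothetical stand-in for a cv::Mat built over external data.
struct FakeMat {
    std::uint8_t* data = nullptr;
};

FakeArray make_array(std::size_t n) {
    FakeArray a;
    a.owner = std::make_shared<Storage>(n);
    a.data = a.owner->data();
    return a;
}

// Models test(): array -> Mat -> array. The result points at the same
// bytes but holds no link back to the owner of the input array.
FakeArray round_trip(const FakeArray& in) {
    FakeMat m;
    m.data = in.data;
    FakeArray out;
    out.data = m.data;
    return out;
}

// Models `img = test(img)`: rebinding img drops the last owner.
bool storage_freed_after_rebinding() {
    FakeArray img = make_array(std::size_t{480} * 640 * 3);
    std::weak_ptr<Storage> probe = img.owner;
    img = round_trip(img);
    return probe.expired();  // img.data now dangles; it is never dereferenced here
}
```

In the real code the returned array does keep a copy of the cv::Mat alive through its capsule, but that cv::Mat holds no reference to the original array, which is the gap this model reproduces.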

I was able to reproduce it in CI using ASAN, via this commit. The reproducing code is fairly simple:

void test_lifetime()
{
    // We need to create a big array to trigger a segfault
    auto create_example_array = []() -> pybind11::array
    {
        constexpr int rows = 1000, cols = 1000;
        std::vector<pybind11::ssize_t> a_shape{rows, cols};
        std::vector<pybind11::ssize_t> a_strides{};
        pybind11::dtype a_dtype = pybind11::dtype(pybind11::format_descriptor<int32_t>::format());
        pybind11::array a(a_dtype, a_shape, a_strides);
        // Set initial values
        for(int i=0; i<rows; ++i)
            for(int j=0; j<cols; ++j)
                *((int32_t *)a.mutable_data(j, i)) = j * rows + i;

        printf("Created array data address =%p\n%s\n",
               a.data(),
               py::str(a).cast<std::string>().c_str());
        return a;
    };

    // Let's reimplement the bound version of the test function via pybind11:
    auto test_bound = [](pybind11::array& a) {
        cv::Mat m = cvnp::nparray_to_mat(a);
        return cvnp::mat_to_nparray(m);
    };

    // Now let's reimplement the failing python code in C++
    //    img = np.zeros(shape=(480, 640, 3), dtype=np.uint8)
    //    img = test(img)
    auto img = create_example_array();
    img = test_bound(img);

    // Let's try to change the content of the img array
    *((int32_t *)img.mutable_data(0, 0)) = 14;  // This triggers an error that ASAN catches
    printf("img data address =%p\n%s\n",
           img.data(),
           py::str(img).cast<std::string>().c_str());
}

I am looking for advice on how to handle this issue. I see several options:

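An ideal solution would be to:

call pybind11::array.inc_ref() when constructing the cv::Mat inside nparray_to_mat
make sure that pybind11::array.dec_ref() is called when this particular instance is destroyed. However, I do not see how to do that.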

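Note: I know that cv::Mat can use a custom allocator, but that does not help here, because the cv::Mat does not allocate memory itself; it uses the memory of the py::array object.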
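The "inc_ref on construction, dec_ref on destruction" idea can be sketched independently of pybind11 and OpenCV. In this hypothetical model, `FakePyObject` stands in for an intrusively reference-counted Python object and `View` for a cv::Mat whose copies share one header; neither is a real library type. A `std::shared_ptr` with a custom deleter acts as the shared guard, so the reference is released exactly once, when the last copy of the view goes away.

```cpp
#include <memory>

// Hypothetical intrusively ref-counted object, standing in for a PyObject.
struct FakePyObject {
    int refcount = 1;
    void inc_ref() { ++refcount; }
    void dec_ref() { --refcount; }  // a real object would free itself at 0
};

// Hypothetical view standing in for a cv::Mat: all copies share one guard.
struct View {
    std::shared_ptr<FakePyObject> guard;
};

View make_view(FakePyObject& obj) {
    obj.inc_ref();  // taken once, when the view is first built
    // The shared_ptr does not own obj; its deleter only gives the
    // reference back when the last copy of the view is destroyed.
    return View{std::shared_ptr<FakePyObject>(
        &obj, [](FakePyObject* o) { o->dec_ref(); })};
}

int refcount_while_two_copies_alive() {
    FakePyObject obj;
    View a = make_view(obj);
    View b = a;  // copying the view does not take another reference
    (void)b;
    return obj.refcount;
}

int refcount_after_all_copies_destroyed() {
    FakePyObject obj;
    {
        View a = make_view(obj);
        View b = a;
        (void)b;
    }
    return obj.refcount;
}
```

With real Python objects, the release step would also need to hold the GIL, since the last copy of the view may be destroyed from any thread.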

Thanks for reading this far, and thanks in advance for any advice!

// Translated from cv2_numpy.cpp in OpenCV source code
class CvnpAllocator : public cv::MatAllocator
{
public:
    CvnpAllocator() = default;
    ~CvnpAllocator() = default;

    // Attaches a numpy array object to a cv::Mat
    static void attach_nparray(cv::Mat &m, pybind11::array& a)
    {
        static CvnpAllocator instance;

        cv::UMatData* u = new cv::UMatData(&instance);
        u->data = u->origdata = (uchar*)a.mutable_data(0);
        u->size = a.size();

        // This is the secret sauce: we inc the number of ref of the array
        u->userdata = a.inc_ref().ptr();
        u->refcount = 1;

        m.u = u;
        m.allocator = &instance;
    }

    cv::UMatData* allocate(int dims0, const int* sizes, int type, void* data, size_t* step, cv::AccessFlag flags, cv::UMatUsageFlags usageFlags) const override
    {
        throw py::value_error("CvnpAllocator::allocate \"standard\" should never happen");
        // return stdAllocator->allocate(dims0, sizes, type, data, step, flags, usageFlags);
    }

    bool allocate(cv::UMatData* u, cv::AccessFlag accessFlags, cv::UMatUsageFlags usageFlags) const override
    {
        throw py::value_error("CvnpAllocator::allocate \"copy\" should never happen");
        // return stdAllocator->allocate(u, accessFlags, usageFlags);
    }

    void deallocate(cv::UMatData* u) const override
    {
        if(!u)
            return;

        // This function can be called from anywhere, so need the GIL
        py::gil_scoped_acquire gil;
        assert(u->urefcount >= 0);
        assert(u->refcount >= 0);
        if(u->refcount == 0)
        {
            PyObject* o = (PyObject*)u->userdata;
            Py_XDECREF(o);
            delete u;
        }
    }
};

cv::Mat nparray_to_mat(pybind11::array& a)
{
    bool is_contiguous = is_array_contiguous(a);
    bool is_not_empty = a.size() != 0;
    if (! is_contiguous && is_not_empty) {
        throw std::invalid_argument("cvnp::nparray_to_mat / Only contiguous numpy arrays are supported. / Please use np.ascontiguousarray() to convert your matrix");
    }

    int depth = detail::determine_cv_depth(a.dtype());
    int type = detail::determine_cv_type(a, depth);
    cv::Size size = detail::determine_cv_size(a);
    cv::Mat m(size, type, is_not_empty ? a.mutable_data(0) : nullptr);

    if (is_not_empty) {
        detail::CvnpAllocator::attach_nparray(m, a); //, ndims, size, type, step);
    }

    return m;
}
