rust-numpy: What's the ideal way to convert `opencv::cv::Mat` from Rust to `numpy.ndarray` in Python?

I have a cv::Mat in Rust and want to pass its data to Python. I checked the docs and found PyArray3, but PyArray3::from_vec3 requires a Vec<Vec<Vec<T>>>. That means I have to build the nested Vec by hand (one copy), and from_vec3 might copy the data a second time, which is quite inefficient. What's the proper way to do this? Is there a way to avoid copying the data?

    #[test]
    fn mat_2_numpy() -> PyResult<()> {
        Python::with_gil(|py| {
            println!("in gil");
            let func: Py<PyAny> = PyModule::from_code(
                py,
                c_str!(
                    "import numpy as np
def call_np(arr):
    print(\"Shape:\", arr.shape)
                "
                ),
                c_str!(""),
                c_str!(""),
            )?
            .getattr("call_np")?
            .into();

            let img = opencv::imgcodecs::imread("/some/path/sample.jpg", 0).unwrap();
            let shape = (img.rows(), img.cols(), img.channels());

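            // from_vec3 requires a Vec<Vec<Vec<T>>>; `v` stands for that nested
            // Vec, which would have to be built from `img` by hand (not shown)
            let array = PyArray3::from_vec3(py, v)?;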

            // pass object with Rust tuple of positional arguments
            let args = (array,);
            let engine_obj = func.call1(py, args)?;

            Ok(())
        })
    }

Mar 05 '25 07:03 tubzby

I think the most efficient way is to go through

PyArray::from_array if you are able to borrow the original data (for example through ArrayBase::from_shape),
or PyArray::from_owned_array otherwise.

Both still require one copy: either the borrowed array is copied into Python, or the data is copied into the owned array (which is then used as the backing buffer of the Python array).

Mar 05 '25 18:03 Icxolu

@Icxolu thank you.

ChatGPT tells me to do this:

            let array = PyArray::from_slice(py, img.data_bytes().unwrap());
            let array = array.reshape(dims.as_slice())?;

Is there still a copy happening under the hood? Are there any constraints that prevent us from sharing data between Rust and Python?
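
For reference, the reshape in the snippet above only works if the Mat's bytes form one contiguous, row-major (rows, cols, channels) buffer. Below is a minimal std-only sketch of that layout, assuming an 8-bit Mat with no row padding. `fits_shape` and `hwc_offset` are hypothetical helper names for illustration, not part of opencv or rust-numpy.

```rust
/// Returns true if a flat byte buffer holds exactly `dims[0] * dims[1] * dims[2]`
/// elements, i.e. it can be viewed as (rows, cols, channels) without padding.
fn fits_shape(buf: &[u8], dims: [usize; 3]) -> bool {
    buf.len() == dims.iter().product::<usize>()
}

/// Offset of element (row, col, ch) inside a contiguous, row-major
/// (rows, cols, channels) buffer. The row count is not needed for the offset.
fn hwc_offset(dims: [usize; 3], row: usize, col: usize, ch: usize) -> usize {
    let [_rows, cols, channels] = dims;
    (row * cols + col) * channels + ch
}
```

With `dims = [2, 3, 4]`, the last element (1, 2, 3) sits at offset 23, which is the final byte of a 24-byte buffer.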

Mar 06 '25 01:03 tubzby

Yes, there will still be one copy here. The ownership models of Python and Rust are quite different, and we cannot track lifetime constraints across the boundary. If we handed Python a pointer to borrowed data, Python could keep it alive after the original Rust owner was dropped, leading to a use-after-free. Mutability and concurrent access are also concerns.
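
The lifetime argument can be illustrated in plain Rust, without Python. This is only a sketch: `copy_for_consumer` and `drop_source_keep_copy` are hypothetical names, and the `Vec<u8>` stands in for the Rust-owned image buffer.

```rust
/// Copies the borrowed bytes into a new allocation owned by the caller.
/// The result no longer depends on the lifetime of `borrowed`.
fn copy_for_consumer(borrowed: &[u8]) -> Vec<u8> {
    borrowed.to_vec()
}

/// Drops the original owner and still returns valid data, because the
/// consumer received a copy rather than a pointer into the source.
fn drop_source_keep_copy() -> Vec<u8> {
    let frame = vec![10u8, 20, 30]; // stand-in for the Rust-owned buffer
    let copy = copy_for_consumer(&frame);
    drop(frame); // the original owner goes away here
    // Returning `&frame[..]` instead would be rejected by the borrow checker.
    // Across the Python boundary there is no such check, hence the copy.
    copy
}
```

Inside Rust the compiler rejects the dangling borrow. A pointer held by a Python object carries no lifetime the compiler could check.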

It is possible to use an owned Rust buffer as a numpy array. PyArray::from_vec, for example, will not copy the data; it consumes the Vec and uses it as the backing memory. It is also possible to borrow the data back from the numpy array:

let vec = vec![1, 2, 3, 4, 5];
let pyarray = PyArray::from_vec(py, vec);
assert_eq!(pyarray.readonly().as_slice().unwrap(), &[1, 2, 3, 4, 5]);
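
The difference between consuming a Vec and copying from a slice can be shown with std alone. This is a sketch of the ownership idea only, not rust-numpy's actual implementation. `OwnedArray`, `adopt` and `copy_in` are hypothetical names.

```rust
/// Stand-in for an array object that owns its backing buffer.
struct OwnedArray {
    data: Vec<u8>,
}

/// Takes ownership of the Vec: the heap allocation is reused, not copied.
fn adopt(buf: Vec<u8>) -> OwnedArray {
    OwnedArray { data: buf }
}

/// Only borrows the input, so it has to allocate and copy.
fn copy_in(buf: &[u8]) -> OwnedArray {
    OwnedArray { data: buf.to_vec() }
}
```

Comparing `as_ptr()` before and after shows the effect: `adopt` keeps the same heap pointer, while `copy_in` yields a different one.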

Mar 06 '25 18:03 Icxolu

Yes, issues like use-after-free are a real concern.

I'm using it in this fashion:

struct Engine {
    detect: Py<PyAny>,
}

impl Engine {

   pub fn new() -> Self {
       // load detect function from .py 
   }

   pub fn detect_frame(&self, mat: &Mat) {
         let bs = mat.data_bytes()?;
         Python::with_gil(|py| {
            let array = PyArray::from_slice(py, bs);
            let array = array.reshape(dims.as_slice())?;
            let detect = self.detect.bind(py);
            let args = (array,);
            let result = detect.call1(args)?;
            ........                
         })
   }
}

A few notes:

I don't own the Mat.
The Mat is quite large (1920x1080 pixels), so copies should be avoided at all costs.
This function is called intensively (30 fps per stream).
I can ensure that no memory violation happens while the GIL is held.

Is there any workaround? Even some unsafe code is acceptable.
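
To put a number on the copy, here is a small arithmetic sketch. The thread does not state the pixel format, so 8-bit samples and the channel counts below are assumptions. `frame_bytes` and `copy_bytes_per_sec` are hypothetical helper names.

```rust
/// Bytes in one tightly packed frame with 8-bit samples (assumed format).
fn frame_bytes(width: u64, height: u64, channels: u64) -> u64 {
    width * height * channels
}

/// Bytes copied per second for one stream if every frame is copied once.
fn copy_bytes_per_sec(bytes_per_frame: u64, fps: u64) -> u64 {
    bytes_per_frame * fps
}
```

Under these assumptions, a 1920x1080 frame with 3 channels is 6,220,800 bytes, so one copy per frame at 30 fps moves 186,624,000 bytes per second per stream. A single-channel frame is 2,073,600 bytes, or 62,208,000 bytes per second.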

Mar 07 '25 00:03 tubzby

I don't think there is a sound way to do what you want here.

Mar 07 '25 20:03 Icxolu

I think it would be a nice feature if we could avoid the memory copy between Rust and Python.

Mar 08 '25 08:03 tubzby