Invoking a Python function implemented in a C extension module doesn't quite work
For context, I am working on a Rust binary that needs to call some Python code. The binary loads an internal Python package that preprocesses input data, which is then passed along to a user-defined Python script. The internal Python package integrates with NumPy and provides a classic C-API extension module that wraps the NumPy API. This extension module provides a higher-order function that takes some input and returns a NumPy ufunc that can be called.
I'm trying to invoke the higher-order function from the Rust binary. No errors are thrown and every call appears to succeed. However, the data returned from the built function is incorrect: it's as if the function wasn't called at all. The values returned match the input, although the datatype of the result does change.
Below is some sample code to help illustrate what's being done; let build_scaler be a valid Python extension module function that takes an integer as input and returns a valid NumPy ufunc that scales an ndarray:
type PyFunc<'py> = Bound<'py, PyAny>;

// internal Python package with the C extension module
struct PyPackage<'py> {
    load_file: PyFunc<'py>,
    // higher order extension module function
    build_scaler: PyFunc<'py>,
}

impl<'py> PyPackage<'py> {
    fn new(py: Python<'py>) -> Result<Self> {
        // load the functions by using getattr...
        Ok(Self {
            load_file,
            build_scaler,
        })
    }

    fn load_data(&self, path: &Path, should_scale: bool, scale: i32) -> Result<Bound<'py, PyAny>> {
        let args = (path,);
        let data = self.load_file.call1(args)?;
        let data = if should_scale {
            let scaler = self.build_scaler((scale,))?;
            let args = (data,);
            let new_data = scaler.call1(args)?;
            // new_data is the same as data, except the datatype has changed
            // would expect every value in new_data to be multiplied by `scale`
            new_data
        } else {
            data
        };
        Ok(data)
    }
}
My workaround was to write a small wrapper function in the Python package that calls the extension module code correctly, and this worked as expected.
I'm just curious about what's going on under the hood that I'm missing. Granted, I didn't try many other approaches to resolve the issue; I simply got it working.
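A wrapper of the kind described above might look roughly like the sketch below. This is not the actual wrapper from the package: build_scaler here is a pure-Python stand-in for the extension function (the real one returns a NumPy ufunc), and scale_data is a hypothetical name. The point is only that the scaler is built and applied entirely on the Python side, so Rust makes a single call.

```python
def build_scaler(scale):
    """Stand-in for the extension function.

    Assumption: the real build_scaler returns a NumPy ufunc. A plain closure
    is used here so the sketch runs without the extension module or NumPy.
    """
    def scaler(values):
        return [v * scale for v in values]
    return scaler


def scale_data(data, scale):
    """Hypothetical wrapper: build the scaler and apply it in one Python call."""
    scaler = build_scaler(scale)
    return scaler(data)


print(scale_data([1, 2, 3], 2))  # prints [2, 4, 6]
```

From Rust, such a wrapper would be looked up with getattr like the other functions and invoked with a single call1((data, scale)).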
Thanks for the question. It's kinda unclear from the sample code above what went wrong. I see that when you call build_scaler you wrote self.build_scaler((scale,)), which I assume was meant to be self.build_scaler.call1((scale,)) and is just a transcription error in the sample.
If you have a small minimal reproduction that is easy to run, someone may be able to investigate whether this is a genuine bug in PyO3.
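When putting a reproduction together, it may help to record exactly what the callable receives when it is invoked from Rust versus directly from Python. The helper below is only a sketch of that idea: log_calls is a hypothetical name, and the lambda stands in for the ufunc returned by build_scaler.

```python
def log_calls(func, log):
    """Wrap func so every call appends (argument type names, result type name) to log."""
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        log.append(([type(a).__name__ for a in args], type(result).__name__))
        return result
    return wrapper


# Stand-in for the scaler; in the real package this would be the ufunc.
log = []
double = log_calls(lambda xs: [x * 2 for x in xs], log)
double([1, 2, 3])
print(log)  # prints [(['list'], 'list')]
```

Handing the wrapped callable to the Rust side and comparing the log with a pure-Python run would show whether the two paths pass the same kinds of objects.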
Perhaps you need to make sure the data is passed in the correct format, for example by building the argument tuple explicitly:
let data_args = PyTuple::new(py, [&data])?;
match scaler.call1(data_args) {...}
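For reference, call1 takes a tuple of positional arguments, so in Python terms it behaves like f(*args) rather than f(args). The small sketch below shows the difference; describe is a hypothetical helper used only for illustration.

```python
def describe(*args):
    """Return the type name of each positional argument received."""
    return [type(a).__name__ for a in args]


data = [1, 2, 3]
args = (data,)

print(describe(*args))  # prints ['list']: the callee receives the data itself
print(describe(args))   # prints ['tuple']: the callee receives the tuple as one argument
```

As far as I can tell, a one-element Rust tuple such as (data,) and a one-element PyTuple both correspond to the first form.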