rust-numpy BitGenerator support

See

https://numpy.org/doc/stable/reference/random/c-api.html
https://numpy.org/doc/stable/reference/random/extending.html

Fixes #498

The idea is to have a safe wrapper around the npy_bitgen struct that implements rand::RngCore. That way pyo3 functions could be passed a np.random.Generator, get that wrapper from it, and pass it to Rust APIs, which could then call its methods repeatedly.

The way it’s implemented, the workflow would look like this:

acquire GIL
downcast a np.random.BitGenerator instance into a numpy::random::PyBitGenerator.
call .lock() on it to get a numpy::random::PyBitGeneratorGuard.
release GIL
call functions on guard object without needing to hold the GIL

TODO:

[ ] I see local crashes when running all tests, so there’s probably some UB, I’d appreciate help to fix it.

Safety

If somebody releases the threading lock of the BitGenerator while we’re using it, this isn’t safe :thinking:

API design options

I could make this more complex by adding a new trait that is implemented by both PyBitGenerator and PyBitGeneratorGuard, allowing to choose if someone wants to

use the PyBitGenerator’s random_* methods directly on that object while holding the GIL and without locking it
use it like it’s used now, by locking the np.random.BitGenerator and returning a GIL-free object that can be used.

but for now I just implemented the use case that’s actually desired.

Jun 06 '25 16:06 flying-sheep

Thanks for the comments, I’ll address them!

The main issue is that I think I’m triggering UB somehow and I don’t know how: when running all tests, often some unrelated test run after this one crashes …

Also, are there any differences between numpy v1 and v2 that we need to consider?

I didn’t forget about this either, will look!

/edit: the C API for random is there since 1.19: https://numpy.org/doc/1.26/reference/random/c-api.html

Jun 09 '25 21:06 flying-sheep

OK, with the release attr and changing the parallel test to use the explicit release as well, the UB now sometimes manifests as a lock poisoning error. progress?

Jun 10 '25 17:06 flying-sheep

I may have found a problem:

This fails as intended:

Python::with_gil(|py| {
    let obj = get_bit_generator(py)?;
    let a = obj.lock()?;
    let b = obj.lock()?;

    Ok::<_, PyErr>(())
})
.unwrap();

returning

called `Result::unwrap()` on an `Err` value: PyErr { type: <class 'RuntimeError'>, value: RuntimeError('BitGenerator is already locked'), traceback: None }

But this does not fail:

Python::with_gil(|py| {
    let a = get_bit_generator(py)?.lock()?;
    let b = get_bit_generator(py)?.lock()?;

    Ok::<_, PyErr>(())
})
.unwrap();

and crucially it gives the same pointers:

[src/random.rs:113:18] ptr = 0x00007b9f6be44cc0
[src/random.rs:113:18] *ptr = bitgen_t {
    state: 0x00007b9f6be44d08,
    next_uint64: 0x00007b9f6837d320,
    next_uint32: 0x00007b9f6837d370,
    next_double: 0x00007b9f6837d3f0,
    next_raw: 0x00007b9f6837d320,
}
[src/random.rs:113:18] ptr = 0x00007b9f6be44cc0
[src/random.rs:113:18] *ptr = bitgen_t {
    state: 0x00007b9f6be44d08,
    next_uint64: 0x00007b9f6837d320,
    next_uint32: 0x00007b9f6837d370,
    next_double: 0x00007b9f6837d3f0,
    next_raw: 0x00007b9f6837d320,
}

So when using multiple threads, for example multiple tests running in parallel, we have a data race on the state. I think we need a lock across all instances to make this work.

Jun 10 '25 18:06 Icxolu

Oh wow, so while default_rng(...).bit_generator.state is always different (somehow), default_rng(...).bit_generator.ctypes.state_address isn’t necessarily (somehow).

note the different seed sequence passed to the function, not even then is the state address different wtf:

>>> np.random.default_rng([1, 4]).bit_generator.ctypes.state_address
4355856392
>>> np.random.default_rng([2, 4]).bit_generator.ctypes.state_address
4355856392

I have no clue what to make of that. I just assumed different random state on the Python side means a different underlying struct, because how can that not be the case?

But anyway, you made me realize that the whole approach is flawed because the same BitGenerator can be passed from the Python side multiple times. So a generator passed from Python doesn’t have guaranteed independent state from another. Therefore if we want to use it, we’d have to use its threading lock as intended instead of abusing that lock into meaning “we can now do whatever we want with it”

So I think the way to go is instead of locking to use spawn to get independent child generators (which are always different, so we could use them to our hearts’ content, but also one should probably use one per thread anyway):

>>> [bg.ctypes.state_address for bg in np.random.default_rng().bit_generator.spawn(2)]
[4355860968, 4355862376]

Jun 11 '25 09:06 flying-sheep

Maybe we should skip the guard part and just lock and unlock within the RngCore implementation itself. Can you give an example for why you'd want this, and why the api has this form? Why would someone want to use this rather than the RngCore impl that rand ships with? Maybe we can come up with a better design.

Jun 11 '25 09:06 mejrs

When implementing Python-facing APIs, having a rng: np.random.Generator parameter is common. I want to write code that actually respects that parameter and uses it instead of ignoring it or calling it once to seed the actual generator.

Jun 11 '25 09:06 flying-sheep

Would it also possible to go the other way, i.e. provide a rand rng from Rust to Python as a numpy BitGenerator?

Jun 24 '25 20:06 juntyr

Yes, that's part of numpy's API as well!

Jun 24 '25 20:06 flying-sheep