usearch Feature: Improved rust bindings

Describe what you are looking for

Hello, while using the Rust bindings to USearch I have come across several possible improvements, which I have implemented in an experimental fork.

Add the scalar type, metric type, and dimensionality as generic arguments to the Index type, because they don't change much and can't be changed anyway:

// Something like this:
pub struct HighLevel<T: VectorType, const D: usize, M: MetricType> {
    _t: PhantomData<T>,
    _m: PhantomData<M>,
    pub(crate) index: Index,
}
// The impl has to repeat the bounds and declare `D` as a const parameter.
impl<T: VectorType, const D: usize, M: MetricType> HighLevel<T, D, M> {
    pub fn try_default() -> Result<Self, cxx::Exception> {
        let mut options = IndexOptions::default();
        options.dimensions = D;
        options.metric = M::get_kind();
        options.quantization = T::quant_type();
        Self::make_index(&options)
    }
}
// So that you could just do this to initialize an index with default expansion etc.
let index = HighLevel::<f32, 4, Cos>::try_default().expect("Failed to create index.");
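
For readers wondering how `M::get_kind()` and `T::quant_type()` could resolve at compile time, here is a minimal standalone sketch. It does not use the usearch crate: `MetricKind`, `ScalarKind`, `ScalarType`, `Options`, and `TypedConfig` are hypothetical stand-ins defined locally for illustration, and the real fork may be structured differently.

```rust
use std::marker::PhantomData;

// Hypothetical stand-ins for the runtime enums an index would be configured with.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MetricKind {
    Cos,
    L2sq,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ScalarKind {
    F32,
    F64,
}

/// Maps a zero-sized marker type to a runtime metric kind.
pub trait MetricType {
    fn get_kind() -> MetricKind;
}

/// Maps a Rust scalar type to a runtime quantization kind.
pub trait ScalarType {
    fn quant_type() -> ScalarKind;
}

// Marker types that only exist at the type level.
pub struct Cos;
pub struct L2sq;

impl MetricType for Cos {
    fn get_kind() -> MetricKind {
        MetricKind::Cos
    }
}

impl MetricType for L2sq {
    fn get_kind() -> MetricKind {
        MetricKind::L2sq
    }
}

impl ScalarType for f32 {
    fn quant_type() -> ScalarKind {
        ScalarKind::F32
    }
}

impl ScalarType for f64 {
    fn quant_type() -> ScalarKind {
        ScalarKind::F64
    }
}

/// Hypothetical runtime options assembled from the type parameters.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Options {
    pub dimensions: usize,
    pub metric: MetricKind,
    pub quantization: ScalarKind,
}

/// Carries scalar type, dimensionality, and metric purely in its type.
pub struct TypedConfig<T: ScalarType, const D: usize, M: MetricType> {
    _t: PhantomData<T>,
    _m: PhantomData<M>,
}

impl<T: ScalarType, const D: usize, M: MetricType> TypedConfig<T, D, M> {
    /// Builds the runtime options from the compile-time parameters.
    pub fn options() -> Options {
        Options {
            dimensions: D,
            metric: M::get_kind(),
            quantization: T::quant_type(),
        }
    }
}
```

With this, `TypedConfig::<f32, 4, Cos>::options()` yields options with 4 dimensions, the cosine metric, and `f32` quantization, without any of those values being passed at runtime.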

Add support for the half crate instead of the custom f16: supporting the half crate allows for better interoperability with the rest of the Rust ecosystem, such as serialisation.
Remove OpenMP or replace it with Rayon: the OpenMP feature doesn't seem to do anything, and there is no Rust support for OpenMP because Rayon roughly takes that spot.

impl<T: VectorType + Sync, const D: usize, M: MetricType + Sync> HighLevel<T, D, M> {
    /// Adds a batch of vectors with multithreading
    /// Faster when inserting large amounts of vectors;
    /// Slower when inserting smaller batches because spinning up the thread pool adds latency.
    /// # Parameters
    /// - `batch`: slice of `(key, vector)` pairs to insert
    ///
    /// # Returns
    /// - `Ok(())` if the batch was inserted successfully
    /// - `Err(cxx::Exception)` if an error occurred during the operation.
    pub fn batch_insert(&self, batch: &[(Key, &[T])]) -> Result<(), cxx::Exception> {
        let len = batch.len();
        self.reserve(len)?;
        batch
            .par_iter()
            .try_for_each(|&(key, value)| self.index.add(key, value))?;
        Ok(())
    }
}

If any of these changes are interesting to you, I could fix up my implementation and provide it.

Can you contribute to the implementation?

[x] I can contribute

Is your feature request specific to a certain interface?

Other bindings

Contact Details

No response

Is there an existing issue for this?

[x] I have searched the existing issues

Code of Conduct

[x] I agree to follow this project's Code of Conduct

Jul 22 '25 08:07 jbrummack

Hi @jbrummack! Points 2 & 3 are reasonable recommendations, aligned with future development goals, but here's how I'd adjust them:

The half crate is quite limited in its functionality. We already depend on the SimSIMD kernels, and it already provides both f16 & bf16 types since 6.5, and will support even more custom numerics in the future. We can change SimSIMD from being a compile-time C-level dependency into being a runtime Rust-level dependency. Once the Rust crate is loaded, its extern C symbols become visible anyway. Our PyPI Python builds already use that trick and decouple SimSIMD versioning from USearch.
Removing OpenMP & custom threading functionality in USearch is another relevant goal, but Rayon is hardly the way forward. I recently wrote a piece - "Fork Union: Beyond OpenMP in C++ and Rust?", showcasing a noticeable gap between OpenMP & Fork Union on one side and the more commonly used thread-pool libraries in C++ and Rust on the other. So it's wiser to replace OpenMP with Fork Union. That's precisely why it was created, but I'm more focused on integrating it into USearch v3, rather than the current v2. Feel free to open a patch if you want to merge it into v2 as well.
```
$ PARALLEL_REDUCTIONS_LENGTH=1536 cargo +nightly bench -- --output-format bencher

test fork_union ... bench:  5,150 ns/iter (+/- 402)
test rayon ... bench:      47,251 ns/iter (+/- 3,985)
test smol ... bench:       54,931 ns/iter (+/- 10)
test tokio ... bench:     240,707 ns/iter (+/- 921)
```

As for the first point, there's a reason we avoid templates/generics. We want the application to be able to read/write arbitrary Index objects from/to disk, without having to instantiate different templates. So let's skip that one for now.

Jul 22 '25 10:07 ashvardanian

> As for the first point, there's a reason we avoid templates/generics. We want the application to be able to read/write arbitrary Index objects from/to disk, without having to instantiate different templates. So let's skip that one for now.

A common pattern to handle this in Rust is to have an enum/tagged union DynamicIndex { F32(Index<f32>), F16(Index<f16>), ... }, seen in symphonia or image.

Methods that don't have the element type in their signature can be implemented on that enum type, and since all variants are known it doesn't require any dynamic dispatch, just a regular jump table.
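
To make the pattern concrete, here is a minimal standalone sketch. It does not touch the usearch crate: `TypedIndex<T>` is a hypothetical toy stand-in for a generic `Index<T>`, and the string tag stands in for whatever scalar-kind marker would be read from a file header.

```rust
/// Hypothetical typed index: a toy stand-in for a generic `Index<T>`.
pub struct TypedIndex<T> {
    dimensions: usize,
    vectors: Vec<Vec<T>>,
}

impl<T: Clone> TypedIndex<T> {
    pub fn new(dimensions: usize) -> Self {
        Self { dimensions, vectors: Vec::new() }
    }

    /// Stores a vector, rejecting ones with the wrong dimensionality.
    pub fn add(&mut self, vector: &[T]) -> Result<(), String> {
        if vector.len() != self.dimensions {
            return Err(format!(
                "expected {} dimensions, got {}",
                self.dimensions,
                vector.len()
            ));
        }
        self.vectors.push(vector.to_vec());
        Ok(())
    }

    pub fn size(&self) -> usize {
        self.vectors.len()
    }
}

/// Tagged union over every supported element type.
pub enum DynamicIndex {
    F32(TypedIndex<f32>),
    F64(TypedIndex<f64>),
}

impl DynamicIndex {
    /// Picks the variant at runtime, e.g. from a tag stored on disk.
    pub fn with_scalar(tag: &str, dimensions: usize) -> Option<Self> {
        match tag {
            "f32" => Some(Self::F32(TypedIndex::new(dimensions))),
            "f64" => Some(Self::F64(TypedIndex::new(dimensions))),
            _ => None,
        }
    }

    /// Type-independent method: a plain `match`, no trait objects involved.
    pub fn size(&self) -> usize {
        match self {
            Self::F32(index) => index.size(),
            Self::F64(index) => index.size(),
        }
    }

    pub fn scalar_name(&self) -> &'static str {
        match self {
            Self::F32(_) => "f32",
            Self::F64(_) => "f64",
        }
    }

    /// Typed access for methods whose signature mentions the element type.
    pub fn as_f32_mut(&mut self) -> Option<&mut TypedIndex<f32>> {
        match self {
            Self::F32(index) => Some(index),
            _ => None,
        }
    }
}
```

An application can then open an index of unknown element type as a `DynamicIndex`, call `size()` on it directly, and drop down to the typed variant only where it actually handles vectors.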

Jul 25 '25 09:07 tnibler

> The half crate is quite limited in its functionality. We already depend on the SimSIMD kernels, and it already provides both f16 & bf16 types since 6.5, and will support even more custom numerics in the future. We can change SimSIMD from being a compile-time C-level dependency into being a runtime Rust-level dependency. Once the Rust crate is loaded, its extern C symbols will anyway become visible. Our PyPI Python builds already use that trick and decouple SimSIMD versioning from USearch.

I meant using the half crate as an interface, because other crates already use it as one. Right now we have to do something like this:

pub fn add_f16_from_hypothetical_json(ix: usearch::Index, data: &str) {
    let arr: Vec<half::f16> = serde_json::from_str(data).unwrap(); //[half::f16::from_f32(0.0); 32];
    let i16s: &[i16] = bytemuck::cast_slice(&arr);
    let halfarr = usearch::f16::from_i16s(i16s);
    ix.add(1, halfarr).unwrap();
}

when it could be this:

pub fn add_f16_from_hypothetical_json(ix: usearch::Index, data: &str) {
    let arr: Vec<half::f16> = serde_json::from_str(data).unwrap(); //[half::f16::from_f32(0.0); 32];
    ix.add(1, &arr).unwrap();
}

In the end, the &[f16] is just a slice that gets interpreted on the C++ side, so it could come from either the usearch f16 type or the half crate. I also understand not wanting to pull in another dependency; maybe a feature gate could be added that switches the f16 implementation to the one from the half crate if needed.
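
As a rough illustration of why the slice's origin should not matter, here is a standalone sketch of a zero-copy view between two 16-bit wrapper types. Both `NativeF16` and `ForeignF16` are hypothetical stand-ins defined here (for the binding's own f16 and for a type like `half::f16`), and the sketch assumes both are `#[repr(transparent)]` wrappers over `u16`, which should be checked against the real types before relying on it.

```rust
/// Hypothetical stand-in for the binding's own 16-bit float wrapper.
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NativeF16(pub u16);

/// Hypothetical stand-in for a third-party half-precision type.
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ForeignF16(pub u16);

/// Reinterprets a slice of one wrapper as a slice of the other without copying.
pub fn as_native(slice: &[ForeignF16]) -> &[NativeF16] {
    // SAFETY: both types are `#[repr(transparent)]` over `u16`, so they share
    // size, alignment, and valid bit patterns; the length is unchanged and the
    // returned slice borrows from the input for the same lifetime.
    unsafe { std::slice::from_raw_parts(slice.as_ptr().cast::<NativeF16>(), slice.len()) }
}
```

A feature gate could expose a conversion like this (or simply alias the type) only when the optional dependency is enabled, so users who don't need it pay nothing extra.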

3: The Fork Union results look very impressive; I am going to have a look at that.

> As for the first point, there's a reason we avoid templates/generics. We want the application to be able to read/write arbitrary Index objects from/to disk, without having to instantiate different templates. So let's skip that one for now.

That makes sense. I'd like to write less when I don't need the runtime flexibility, as a sort of add-on; I'm probably just going to build a separate crate for that.

Jul 27 '25 10:07 jbrummack

> That makes sense, I like to have to write less if I don't need runtime flexibility as a sort of add on; probably just going to build a separate crate for that or something.

Sure, @jbrummack! Writing a separate third-party Rust binding for USearch that wraps the lower-level typed index_gt, rather than the type-punned index_dense_gt, is an option.

> I also understand not wanting to pull another dependency in, maybe a feature gate could be added that switches the f16 implementation with the one from the half crate if needed.

Yes, having too many low-impact dependencies isn't a good idea. Feature-gating is an option if the half crate is indeed popular among USearch users.

Jul 27 '25 18:07 ashvardanian