[WIP] Add filtering support for IVF-PQ and BITMAP filtering for CAGRA

Open jeremywgleeson opened this issue 2 months ago • 6 comments

Summary

This PR adds filter support to the IVF-PQ C API (which previously had no filtering capability) and adds BITMAP filter support to the CAGRA and IVF-Flat C APIs (which previously supported only BITSET filters). It also adds filter support to the Rust crate, where filtering was previously not exposed.

Changes

  1. IVF-PQ C API - Add Filter Support

    • Add cuvsFilter filter parameter to cuvsIvfPqSearch
    • Implement BITMAP (per-query) and BITSET (global) filter support
    • IVF-PQ previously had no filtering capability in the C API
  2. IVF-Flat C API - Add BITMAP Filter Support

    • Add BITMAP filter support to cuvsIvfFlatSearch
    • IVF-Flat previously supported only BITSET filters; it now supports both
    • Matches full filtering functionality available in the C++ API
  3. CAGRA C API - Add BITMAP Filter Support

    • Add BITMAP filter support to cuvsCagraSearch
    • CAGRA previously supported only BITSET filters; it now supports both
    • Matches full filtering functionality available in the C++ API
  4. Comprehensive Test Coverage

    • Add filtered search tests for IVF-PQ (ann_ivf_pq_c.cu)
    • Add filtered search tests for IVF-Flat (ann_ivf_flat_c.cu)
    • Add filtered search tests for CAGRA (ann_cagra_c.cu)
    • Each test suite includes both BITSET and BITMAP filter validation
    • Tests verify that filtered results respect the exclusion criteria
  5. Rust Language Bindings - Complete Filter Support

    • Add new filters.rs module with comprehensive filter utilities
    • BITSET Helpers:
      • bitset_from_excluded_indices() - Create global filter from excluded indices
      • bitset_from_included_indices() - Create global filter from included indices
    • BITMAP Helpers:
      • bitmap_from_excluded_indices() - Create per-query filters from excluded indices
      • bitmap_from_included_indices() - Create per-query filters from included indices
    • All functions follow idiomatic Rust patterns with proper error handling
    • Memory-safe wrappers around DLPack tensors for filter data
    • Comprehensive documentation and examples for each function
  6. Other Language Bindings

    • Update Python bindings to accept optional filter parameter
    • Update Go bindings to pass NO_FILTER by default
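
The two filter kinds listed above differ mainly in shape: a BITSET carries one bit per dataset vector and applies to every query, while a BITMAP carries one bit per (query, vector) pair. The host-side C sketch below illustrates that layout only. It assumes 32-bit words, that a set bit means "keep this vector", and that BITMAP rows are packed back to back in row-major order. The helper names (`filter_words`, `filter_fill_keep_all`, `filter_clear_bit`, `filter_bit_is_set`, `bitset_exclude`, `bitmap_exclude`) are hypothetical and are not part of the cuVS C API or the new Rust filters module.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of 32-bit words needed to store n_bits bits. */
static size_t filter_words(size_t n_bits) { return (n_bits + 31u) / 32u; }

/* Start from "keep everything": set every bit in the buffer to 1. */
static void filter_fill_keep_all(uint32_t* words, size_t n_bits)
{
  memset(words, 0xFF, filter_words(n_bits) * sizeof(uint32_t));
}

/* Clear one bit, marking that position as excluded. */
static void filter_clear_bit(uint32_t* words, size_t bit)
{
  words[bit / 32u] &= ~(UINT32_C(1) << (bit % 32u));
}

/* Returns 1 if the bit is set (kept), 0 if it is cleared (excluded). */
static int filter_bit_is_set(const uint32_t* words, size_t bit)
{
  return (int)((words[bit / 32u] >> (bit % 32u)) & 1u);
}

/* BITSET (global): one bit per dataset vector, shared by all queries. */
static void bitset_exclude(uint32_t* bitset, size_t n_samples, size_t sample)
{
  assert(sample < n_samples);
  filter_clear_bit(bitset, sample);
}

/* BITMAP (per-query): n_queries * n_samples bits, assumed row-major,
 * so the bit for (query, sample) sits at query * n_samples + sample. */
static void bitmap_exclude(uint32_t* bitmap, size_t n_queries, size_t n_samples,
                           size_t query, size_t sample)
{
  assert(query < n_queries && sample < n_samples);
  filter_clear_bit(bitmap, query * n_samples + sample);
}
```

Under these assumptions, excluding vector 15 for query 1 in a 2 x 20 bitmap clears flat bit 1 * 20 + 15 = 35 and leaves query 0's bit for vector 15 untouched, whereas the same exclusion in a bitset hides vector 15 from every query.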

Backward Compatibility

C API - Breaking Changes:

  • ⚠️ BREAKING: cuvsIvfPqSearch now requires an additional trailing cuvsFilter filter parameter
  • Existing C code calling IVF-PQ search must be updated to pass a filter parameter (use {.type = NO_FILTER, .addr = (uintptr_t)NULL} for no filtering)
  • Note: IVF-Flat and CAGRA already had the filter parameter, so no breaking changes for those
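
For callers that do not need filtering, the migration amounts to building a "no filter" value and passing it as the new trailing argument. The sketch below shows that value using a stand-in type so that it compiles without the cuVS headers. `demo_filter`, `demo_filter_type`, and `demo_no_filter` are hypothetical names, and the field layout is assumed from the initializer quoted above rather than copied from the headers; real code would use `cuvsFilter` from the cuVS C API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the filter kinds named in this PR (assumed enumerators). */
typedef enum { DEMO_NO_FILTER, DEMO_BITSET, DEMO_BITMAP } demo_filter_type;

/* Stand-in for cuvsFilter: an address plus a tag saying how to read it. */
typedef struct {
  uintptr_t addr;        /* address of the filter data, stored as an integer */
  demo_filter_type type; /* which kind of filter addr refers to */
} demo_filter;

/* Build the "no filtering" value that existing call sites can pass. */
static demo_filter demo_no_filter(void)
{
  demo_filter f = {.type = DEMO_NO_FILTER, .addr = (uintptr_t)NULL};
  return f;
}

/* With the real headers, a call site would change roughly like this
 * (the other arguments are elided because the PR text does not list them):
 *
 *   cuvsFilter filter = {.type = NO_FILTER, .addr = (uintptr_t)NULL};
 *   cuvsIvfPqSearch(..., filter);
 */
```

Designated initializers keep the call site correct regardless of the order in which the struct declares its fields, which is presumably why the PR text recommends that form.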

(Based on #664, it seems that this type of change is not considered breaking?)

Rust API - Breaking Changes:

  • ⚠️ BREAKING: Search function signatures updated to include filter parameter
  • Existing Rust code must be updated to pass a filter (use appropriate helper functions from new filters module or None)

Python API - Non-Breaking:

  • Filter parameter is optional with default value filter=None
  • Filter placed after resources parameter to maintain backward compatibility for positional arguments
  • Existing Python code continues to work unchanged

Go API - Non-Breaking:

  • Filter automatically initialized to NO_FILTER internally
  • Existing Go code continues to work unchanged

Testing

All new functionality is covered by unit tests that:

  • Create random datasets and queries
  • Apply filters that exclude even-indexed vectors (each 32-bit filter word is 0xAAAAAAAA, which sets only the odd-numbered bits)
  • Verify all returned neighbors are odd-indexed
  • Test both BITMAP and BITSET filter modes for each index type

Closes #1464

jeremywgleeson • Oct 29 '25 23:10

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

copy-pr-bot[bot] • Oct 29 '25 23:10

/ok to test 6ee63bf

cjnolet • Oct 30 '25 16:10

Thanks so much for the contribution @jeremywgleeson! These are a few important features that we've had on our roadmap and we really appreciate your help here.

cjnolet • Oct 30 '25 16:10

/ok to test 577d5de

cjnolet • Oct 30 '25 17:10

/ok to test 760b3d1

benfred • Oct 30 '25 18:10

/ok to test e6feca5

benfred • Oct 31 '25 17:10