LEANN
[Multi-vector] Add timing instrumentation and multi-dataset support for multi-vector retrieval
- Add timing measurements for search operations (load and core time)
- Increase embedding batch size from 1 to 32 for better performance
- Add explicit memory cleanup with `del all_embeddings`
- Support loading and merging multiple datasets with different splits
- Add CLI arguments for search method selection (ann/exact/exact-all)
- Auto-detect image field names across different dataset structures
- Print candidate doc counts for performance monitoring
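
For reviewers, here is a minimal sketch of how the CLI selection, batching, timing, and image-field detection described above could fit together. It is not the code in this PR. The flag names `--search-method` and `--batch-size`, the helpers `timed`, `batched`, and `detect_image_field`, and the candidate field names are all assumptions made for illustration; only the three method values (ann/exact/exact-all) and the batch size of 32 come from the description.

```python
import argparse
import time
from contextlib import contextmanager

# Hypothetical candidate names: the PR does not list which fields it checks.
IMAGE_FIELD_CANDIDATES = ("image", "images", "page_image", "img")


def detect_image_field(column_names, candidates=IMAGE_FIELD_CANDIDATES):
    """Return the first candidate name present in column_names."""
    for name in candidates:
        if name in column_names:
            return name
    raise KeyError(f"no image field found among {list(column_names)}")


@contextmanager
def timed(label, timings):
    """Store the wall-clock seconds spent in the enclosed block under label."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start


def batched(items, batch_size):
    """Yield consecutive slices of items with at most batch_size elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def build_parser():
    """Build a parser with assumed flag names for method and batch size."""
    parser = argparse.ArgumentParser(description="multi-vector search sketch")
    parser.add_argument(
        "--search-method",
        choices=["ann", "exact", "exact-all"],
        default="ann",
    )
    parser.add_argument("--batch-size", type=int, default=32)
    return parser
```

A caller could wrap index loading and the search call in separate `with timed("load", timings):` and `with timed("core", timings):` blocks, then print `timings` next to the candidate doc count to get the load and core figures mentioned above.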
🤖 Generated with Claude Code
What does this PR do?
Related Issues
Fixes #
Checklist
- [ ] Tests pass (`uv run pytest`)
- [ ] Code formatted (`ruff format` and `ruff check`)
- [ ] Pre-commit hooks pass (`pre-commit run --all-files`)