[Multi-vector] Add timing instrumentation and multi-dataset support for multi-vector retrieval

Open yichuan-w opened this issue 2 months ago • 0 comments

  • Add timing measurements for search operations (load and core time)
  • Increase embedding batch size from 1 to 32 for better performance
  • Add explicit memory cleanup with del all_embeddings
  • Support loading and merging multiple datasets with different splits
  • Add CLI arguments for search method selection (ann/exact/exact-all)
  • Auto-detect image field names across different dataset structures
  • Print candidate doc counts for performance monitoring
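
To make a few of the items above concrete, here is a minimal sketch of how the timing measurements, the image-field auto-detection, and the search-method CLI argument might look. It is not the code from this PR. The helper names (`timed`, `detect_image_field`, `build_parser`), the `--search-method` flag spelling, and the candidate field names are all assumptions made for illustration; only the three method choices (ann/exact/exact-all) come from the description.

```python
import argparse
import time
from contextlib import contextmanager

# Hypothetical candidate names; the PR does not list the fields it checks.
IMAGE_FIELD_CANDIDATES = ("image", "images", "page_image", "img")


@contextmanager
def timed(label, timings):
    """Accumulate wall-clock seconds spent in the block under timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[label] = timings.get(label, 0.0) + elapsed


def detect_image_field(column_names, candidates=IMAGE_FIELD_CANDIDATES):
    """Return the first candidate field present in column_names."""
    for name in candidates:
        if name in column_names:
            return name
    raise KeyError(f"no image field found among {list(column_names)}")


def build_parser():
    """CLI parser; the flag name is an assumption, the choices are from the PR."""
    parser = argparse.ArgumentParser(description="multi-vector retrieval sketch")
    parser.add_argument(
        "--search-method",
        choices=["ann", "exact", "exact-all"],
        default="ann",
    )
    return parser
```

Wrapping index loading in `with timed("load", timings):` and the search call in `with timed("core", timings):` would give the separate load and core figures the description mentions, and summing into a dict keeps the totals meaningful across repeated queries.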

🤖 Generated with Claude Code

What does this PR do?

Related Issues

Fixes #

Checklist

  • [ ] Tests pass (uv run pytest)
  • [ ] Code formatted (ruff format and ruff check)
  • [ ] Pre-commit hooks pass (pre-commit run --all-files)

yichuan-w, Nov 10 '25 21:11