Search with `recompute` has second-level latency for code RAG

Open ASuresh0524 opened this issue 1 month ago • 0 comments

What happened?

Title: Search with recompute takes ~15s even after warmup

Environment

  • macOS (darwin 24.6.0)
  • Python 3.10.18
  • LEANN branch: feature/colqwen-integration

Steps to Reproduce

  1. Create a tiny code repo:

    mkdir -p /tmp/quick_test_code
    cat <<'PY' > /tmp/quick_test_code/test.py
    def hello():
        return "world"
    
    class Test:
        pass
    PY
    
  2. Build an index:

    leann build quick-test-index \
      --docs /tmp/quick_test_code \
      --use-ast-chunking \
      --embedding-model facebook/contriever \
      --embedding-mode sentence-transformers
    
  3. Run searches (measure wall time):

    time leann search quick-test-index "hello" --top-k 3
    time leann search quick-test-index "Test" --top-k 3
    time leann search quick-test-index "function" --top-k 3
    

Observed

  • Build completes (~15 s).
  • Each search takes 13–19 s, even after multiple runs.
  • Warm and cold runs differ by only ~1.2 s, so most of the time is spent recomputing embeddings on each query.
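
To make the warm vs cold comparison above easier to reproduce, here is a minimal sketch of a timing harness that runs a command several times and records the wall-clock time of each run. The helper name `time_command` is hypothetical and not part of LEANN; it uses only the Python standard library and treats the CLI as a black box.

```python
import subprocess
import time


def time_command(cmd, runs=3):
    """Run `cmd` `runs` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard output so terminal rendering does not skew the measurement.
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        timings.append(time.perf_counter() - start)
    return timings
```

For example, `time_command(["leann", "search", "quick-test-index", "hello", "--top-k", "3"], runs=5)` would return five timings; comparing the first entry against the rest gives the cold vs warm gap. Note that every run is still a fresh process, so any per-process startup cost is included in each number.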

Expected

  • Searches with recompute=True may be slower, but should not take 13–19 s on a trivial index.
  • I would like to understand whether there is a way to avoid a full model reload or to make on-the-fly query embedding faster.
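
One way to tell a full model reload apart from slow per-query embedding is to time the two phases separately inside a single process. The sketch below assumes two hypothetical callables supplied by the caller: `load()`, which does the one-time setup (for instance opening the index and loading the embedding model), and `search(handle, query)`, which runs one query against the loaded handle. Neither name comes from LEANN; they are placeholders for whatever Python entry points are actually used.

```python
import time


def profile_search(load, search, queries):
    """Time the one-time `load()` separately from each `search(handle, query)` call."""
    start = time.perf_counter()
    handle = load()  # hypothetical one-time setup (index + embedding model)
    load_seconds = time.perf_counter() - start

    query_seconds = {}
    for query in queries:
        start = time.perf_counter()
        search(handle, query)  # hypothetical per-query call on the loaded handle
        query_seconds[query] = time.perf_counter() - start
    return load_seconds, query_seconds
```

If `load_seconds` accounts for most of the 13–19 s, the cost is per-process startup and a long-lived process would avoid it; if the per-query times stay high, the recompute path itself is the bottleneck.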

How to reproduce

Follow steps from above.

Error message


LEANN Version

0.1.0

Operating System

macOS

ASuresh0524, Nov 25 '25 09:11