🔮 Project: Multi-Vector Retrieval (ColBERT)
WHY Currently, VOD Training only supports document-level embeddings. Each document is represented by a single vector, which limits the granularity of the contextual information that can be captured.
ColBERT introduced a finer-grained interaction by encoding each passage into a matrix of token-level embeddings. At search time, it embeds every query into another matrix and efficiently finds passages that contextually match the query using scalable vector-similarity operators.
The rich interactions enabled by ColBERT have been shown to yield higher retrieval quality than single-vector representation models. However, scaling this approach efficiently to large corpora is not trivial.
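The late interaction described above can be illustrated with a minimal sketch of a MaxSim-style score: each query token is matched to its most similar passage token, and the per-token maxima are summed. The function names and the toy 2-dimensional embeddings are made up for illustration and are not part of vod or of the ColBERT codebase.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]  # one row per token embedding


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query: Matrix, passage: Matrix) -> float:
    """Late-interaction score: for each query token, keep the similarity of its
    best-matching passage token, then sum those maxima over the query tokens."""
    return sum(max(dot(q, d) for d in passage) for q in query)


# Toy token embeddings (hypothetical values, 2 dimensions for readability).
query = [[1.0, 0.0], [0.0, 1.0]]
passage_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # matches both query tokens
passage_b = [[0.5, 0.5]]                          # one generic token only

score_a = maxsim_score(query, passage_a)
score_b = maxsim_score(query, passage_b)
```

Here `passage_a` scores higher than `passage_b` because it contains a close match for each query token, which a single pooled vector could not express.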
HOW The project will address these limitations through the following means:
Utilizing Fine-Grained Contextual Late Interaction:
- Leverage ColBERT's ability to encode queries and passages into sequences of token-level embeddings.
- Improve vod's on-disk data structures to handle 3-dimensional tensors with variable shapes (e.g., shape `N x ? x H`).
- Implement ColBERT's `MaxSim` operator in the loss layer.
- Implement ColBERT's two-stage retrieval.
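One common way to store a tensor of shape `N x ? x H` is a single flat buffer of token vectors plus an offsets array marking where each document starts. The sketch below illustrates that layout with the Python standard library only; `RaggedStore` and its methods are hypothetical names and do not reflect vod's actual data structures.

```python
from array import array
from typing import List, Sequence, Tuple


class RaggedStore:
    """Hypothetical store for N documents with varying token counts and a
    fixed hidden size H: one flat float32 buffer plus int64 offsets."""

    def __init__(self, hidden_size: int) -> None:
        self.hidden_size = hidden_size
        self.values = array("f")        # all token vectors, concatenated
        self.offsets = array("q", [0])  # doc i owns tokens offsets[i]:offsets[i+1]

    def append(self, doc: Sequence[Sequence[float]]) -> None:
        """Append one document given as a list of token vectors."""
        for token in doc:
            if len(token) != self.hidden_size:
                raise ValueError("every token vector must have length H")
            self.values.extend(token)
        self.offsets.append(self.offsets[-1] + len(doc))

    def __len__(self) -> int:
        return len(self.offsets) - 1

    def __getitem__(self, i: int) -> List[List[float]]:
        """Return document i as a (num_tokens x H) nested list."""
        if not 0 <= i < len(self):
            raise IndexError(i)
        h = self.hidden_size
        start, stop = self.offsets[i], self.offsets[i + 1]
        return [list(self.values[t * h:(t + 1) * h]) for t in range(start, stop)]

    def to_bytes(self) -> Tuple[bytes, bytes]:
        """Serialize as two raw buffers that could be written to disk."""
        return self.offsets.tobytes(), self.values.tobytes()

    @classmethod
    def from_bytes(cls, hidden_size: int, offsets_raw: bytes,
                   values_raw: bytes) -> "RaggedStore":
        """Rebuild a store from the buffers produced by `to_bytes`."""
        store = cls(hidden_size)
        store.offsets = array("q")
        store.offsets.frombytes(offsets_raw)
        store.values = array("f")
        store.values.frombytes(values_raw)
        return store


store = RaggedStore(hidden_size=2)
store.append([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])  # document with 3 tokens
store.append([[0.25, 0.75]])                         # document with 1 token
restored = RaggedStore.from_bytes(2, *store.to_bytes())
```

Because both buffers are contiguous, a layout like this can be memory-mapped and sliced per document without padding every passage to the longest length.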
Combine T5 Models with ColBERT:
- Benchmark ColT5 against ColBERT
- Benchmark the end-to-end search latency in a search engine such as Raffle.
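As a rough illustration of how end-to-end latency could be measured, the sketch below times a search callable over a set of queries and reports the mean, median and 95th-percentile latency. `measure_latency` and `dummy_search` are hypothetical placeholders; a real benchmark would call the actual retriever and assumes a non-empty query list.

```python
import statistics
import time
from typing import Callable, Dict, Sequence


def measure_latency(search: Callable[[str], object],
                    queries: Sequence[str],
                    warmup: int = 2) -> Dict[str, float]:
    """Time `search` end to end on each query; results are in milliseconds."""
    for q in queries[:warmup]:  # untimed warm-up calls (caches, lazy loading)
        search(q)
    timings_ms = []
    for q in queries:
        start = time.perf_counter()
        search(q)
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    timings_ms.sort()
    # Nearest-rank style index for the 95th percentile, clamped to the list.
    p95_index = min(len(timings_ms) - 1, int(0.95 * len(timings_ms)))
    return {
        "mean_ms": statistics.fmean(timings_ms),
        "p50_ms": statistics.median(timings_ms),
        "p95_ms": timings_ms[p95_index],
    }


def dummy_search(query: str) -> list:
    """Stand-in for a real retriever call (hypothetical)."""
    return sorted(query.split())


report = measure_latency(dummy_search, ["what is late interaction"] * 20)
```

Reporting tail latency alongside the mean matters here because multi-vector retrieval cost can vary with query length.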
Implement XTR: ConteXtualized Token Retriever:
- Implement XTR loss
- Implement XTR one-stage retrieval
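The one-stage retrieval idea can be sketched as follows: retrieve the top-k' document tokens for each query token, then score each candidate document from those retrieved similarities alone, filling in a default value for query tokens that retrieved nothing from that document. This sketch assumes the imputed value is the lowest retrieved similarity for that query token, and it uses a brute-force scan in place of an approximate nearest-neighbour token index. All names are hypothetical and the code is not taken from the XTR implementation.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def xtr_style_search(query: List[Vector],
                     corpus_tokens: List[Tuple[int, Vector]],
                     k_prime: int) -> List[Tuple[int, float]]:
    """Rank documents using only the top-k' retrieved tokens per query token.

    `corpus_tokens` holds (doc_id, token_embedding) pairs and must be non-empty.
    """
    n = len(query)
    # best[doc_id][i] = highest retrieved similarity of query token i in doc_id
    best: Dict[int, Dict[int, float]] = defaultdict(dict)
    floor: List[float] = []  # floor[i] = value imputed for query token i
    for i, q in enumerate(query):
        # Brute-force token retrieval (assumption: stands in for an ANN index).
        hits = sorted(((dot(q, vec), doc_id) for doc_id, vec in corpus_tokens),
                      key=lambda hit: hit[0], reverse=True)[:k_prime]
        floor.append(hits[-1][0])  # lowest similarity among the retrieved tokens
        for sim, doc_id in hits:
            if sim > best[doc_id].get(i, float("-inf")):
                best[doc_id][i] = sim
    # Average over query tokens, imputing floor[i] where nothing was retrieved.
    ranking = [(doc_id, sum(per_token.get(i, floor[i]) for i in range(n)) / n)
               for doc_id, per_token in best.items()]
    return sorted(ranking, key=lambda pair: pair[1], reverse=True)


# Toy corpus of (doc_id, token embedding) pairs with made-up values.
corpus = [
    (0, [1.0, 0.0]), (0, [0.0, 1.0]),
    (1, [0.5, 0.0]),
    (2, [0.0, 0.75]),
    (3, [0.25, 0.25]),
]
ranking = xtr_style_search([[1.0, 0.0], [0.0, 1.0]], corpus, k_prime=3)
scores = dict(ranking)
```

In this toy run, document 1 is retrieved only by the first query token, so its second term is imputed rather than computed by loading the full document, which is what removes the separate gathering stage.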
Refinements:
- Investigate robust multi-hop reasoning at scale via condensed retrieval (e.g., Baleen).
- Explore effective and efficient retrieval via Lightweight Late Interaction (e.g., PLAID)
WHAT The anticipated outcomes of this project include:
- State-of-the-art retrieval for RAG models (T5 + XTR)
- A scalable solution capable of handling large corpora without compromising efficiency.
References
- ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT (SIGIR'20).
- Relevance-guided Supervision for OpenQA with ColBERT (TACL'21).
- Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval (NeurIPS'21).
- ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction (NAACL'22).
- PLAID: An Efficient Engine for Late Interaction Retrieval (CIKM'22).
- Rethinking the Role of Token Retrieval in Multi-Vector Retrieval (NeurIPS'23).