Xtr
Pull Request
What does this PR do?
Integrate XTR Model Support and Modified colbertv2 Indexing
Notes:
- Replace (issue) above ↑↑↑ with the issue this PR closes to automatically link the two. This must be done when the PR is created.
- Add multiple Closes #(issue) as needed.
- If this PR is work towards but does not close an issue, simply tag the issue without mentioning Closes.
Description
Describe the changes proposed by this PR to give the reviewer context below ↓↓↓
This pull request adds support for XTR, an efficient multi-vector retrieval model, and modifies the colbertv2 indexing to work with it. The key changes are:
- Implemented the XTR multi-vector model, following the research paper.
- Adapted the colbertv2 indexing code to work with the new XTR model.
- Published the model weights for this XTR implementation on the Hugging Face model repository (checkpoint).
- A notebook for indexing and retrieval is available here.
- Note that this PR implements XTR from the research paper; the authors have not yet released their full official implementation.
- So far the authors have released only the inference code for XTR, on 8 April 2024. This implementation is therefore an early integration of XTR into our system, pending the official release of the complete implementation.
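For reviewers less familiar with multi-vector retrieval, the following is a minimal sketch of the late-interaction (MaxSim) scoring that ColBERT-style models use: each query token is matched to its best document token and the matches are summed. It is an illustration only, not the code in this PR, and it does not reproduce XTR's token-retrieval step. The names `dot`, `maxsim_score`, and `rank_documents` are hypothetical, and the tiny 2-dimensional vectors stand in for real token embeddings.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Sequence[Vector], doc_vecs: Sequence[Vector]) -> float:
    """For each query token vector, take its best match among the document
    token vectors, then sum those best matches."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def rank_documents(
    query_vecs: Sequence[Vector], docs: Dict[str, Sequence[Vector]]
) -> List[Tuple[str, float]]:
    """Score every document against the query and sort by descending score."""
    scores = {doc_id: maxsim_score(query_vecs, vecs) for doc_id, vecs in docs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data (assumed for illustration): one vector per token.
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "doc_a": [[1.0, 0.0], [0.0, 1.0]],  # matches both query tokens exactly
    "doc_b": [[0.5, 0.5]],              # partial match for each query token
}
ranking = rank_documents(query, docs)
```

Here `doc_a` ranks above `doc_b` because each query token finds an exact match in it. A real index stores one vector per document token, which is why indexing changes are needed when a new multi-vector model is introduced.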
Request Review
Be sure to request a review from one or more reviewers (unless the PR is to an unprotected branch).
Versioning
When opening a PR to make changes to PrimeQA (i.e. primeqa/) master, be sure to increment the version following semantic versioning. The VERSION is stored here and is incremented using bump2version {patch,minor,major} as described in the [development guide documentation](development.html).
- [ ] Have you updated the VERSION?
- [ ] Or does this PR not change the primeqa package, or does it not target master?
After pulling in changes from master to an existing PR, ensure the VERSION is updated appropriately. This may require bumping the version again if it has been previously bumped.
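As a rough illustration of what the patch, minor, and major increments mean under semantic versioning, the sketch below bumps a plain MAJOR.MINOR.PATCH string. The helper `bump_version` is hypothetical; it is not part of bump2version or PrimeQA, and it assumes the version has no pre-release or build suffix.

```python
def bump_version(version: str, part: str) -> str:
    """Return `version` with the given part incremented.

    Bumping a part resets every lower-order part to zero,
    as semantic versioning requires.
    """
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r} (expected patch, minor, or major)")


# Example version strings are made up for illustration.
patched = bump_version("1.2.3", "patch")
minored = bump_version("1.2.3", "minor")
majored = bump_version("1.2.3", "major")
```

In practice the bump should still be done with bump2version, so that VERSION and any other tracked files stay in sync.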
If you're not quite ready yet to post a PR for review, feel free to open a draft PR.
Releases
After Merging
If merging into master and VERSION was updated, after this PR is merged:
- [ ] Create a release from master with version equal to VERSION
- [ ] Not merging to master or VERSION not updated
Checklist
Review the following and mark as completed:
- [ ] Tag an issue or issues this PR addresses.
- [ ] Added description of changes proposed.
- [ ] Review requested as appropriate.
- [ ] Version bumped as appropriate.
- [ ] New classes, methods, and functions documented.
- [ ] Documentation for modified code is updated.
- [ ] Built documentation to confirm it renders as expected (see here).
- [ ] Code cleaned up and commented out code removed.
- [ ] Tests added to ensure all functionality is tested at >= 60% unit test coverage (see here).
- [ ] Release created as needed after merging.