Does the ColBERT model return the correct results for query and document embeddings and for reranking?
System Info
Docker image: docker.io/michaelf34/infinity:0.0.75
GPU: not set
Model: jinaai/jina-colbert-v2
Information
- [ ] Docker + cli
- [ ] pip + cli
- [ ] pip + usage of Python interface
Tasks
- [ ] An officially supported CLI command
- [ ] My own modifications
Reproduction
Reference
https://jina.ai/news/jina-colbert-v2-multilingual-late-interaction-retriever-for-embedding-and-reranking/
Embeddings
The article says there is an input_type parameter (query|document) that returns different embeddings.
But I found that input_type has no effect in infinity. The number of returned vectors depends on the token count.
Setting dimensions to a large value for an input with few tokens returns an error.
The shape is always [f(token.size), 1028]; if dimensions is set, the returned shape is [dimensions, 1028].
However, I tried the Jina API:
https://jina.ai/api-dashboard/embedding
It returns shape [f(token.size), dimensions] for a document and [32, dimensions] for a query.
I ran the following tests (replace <JINA_TOKEN> below):
curl 'https://api.jina.ai/v1/multi-vector' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <JINA_TOKEN>' \
-d '{
"model": "jina-colbert-v2",
"dimensions": 64,
"input_type": "document",
"embedding_type": "float",
"input": [
"Your document text string goes here",
"You can send multiple texts",
"Each text can be up to 8192 tokens long"
]}' > document-64.json
curl 'https://api.jina.ai/v1/multi-vector' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <JINA_TOKEN>' \
-d '{
"model": "jina-colbert-v2",
"dimensions": 128,
"input_type": "document",
"embedding_type": "float",
"input": [
"Your document text string goes here",
"You can send multiple texts",
"Each text can be up to 8192 tokens long"
]}' > document-128.json
curl 'https://api.jina.ai/v1/multi-vector' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <JINA_TOKEN>' \
-d '{
"model": "jina-colbert-v2",
"dimensions": 128,
"input_type": "query",
"embedding_type": "float",
"input": [
"Your document text string goes here",
"You can send multiple texts",
"Each text can be up to 8192 tokens long"
]}' > query-128.json
Compare query-128.json and document-128.json: the same text gets different vectors.
This differs from the infinity implementation, which returns the same vectors.
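To make the comparison reproducible, here is a minimal Python sketch that reads the saved responses and reports each input's shape and whether two results match. It assumes the response stores the token vectors under data[i]["embeddings"]; that key, and the helper names extract_multi_vectors, shape and same_matrix, are assumptions for illustration, so adjust them to the JSON you actually receive.

```python
import json
import math
from typing import Any, Dict, List, Tuple

Matrix = List[List[float]]


def extract_multi_vectors(response: Dict[str, Any], key: str = "embeddings") -> List[Matrix]:
    """Return one token-vector matrix per input text.

    ASSUMPTION: each item of response["data"] holds its matrix under `key`.
    """
    return [item[key] for item in response["data"]]


def load_multi_vectors(path: str, key: str = "embeddings") -> List[Matrix]:
    """Load a saved response (e.g. document-128.json) from disk."""
    with open(path, encoding="utf-8") as fh:
        return extract_multi_vectors(json.load(fh), key)


def shape(matrix: Matrix) -> Tuple[int, int]:
    """(number of token vectors, vector dimension)."""
    return (len(matrix), len(matrix[0]) if matrix else 0)


def same_matrix(a: Matrix, b: Matrix, tol: float = 1e-6) -> bool:
    """True if both matrices have the same shape and element-wise equal values."""
    if shape(a) != shape(b):
        return False
    return all(
        math.isclose(x, y, abs_tol=tol)
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )
```

Usage: call load_multi_vectors("query-128.json") and load_multi_vectors("document-128.json"), then print shape(...) and same_matrix(...) for each pair of inputs. The same helpers can be pointed at a saved infinity response, provided it uses a compatible layout.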
Rerank
jina-colbert-v2 also has a /rerank API, but infinity's Swagger UI returns:
{
"error": {
"message": "ModelNotDeployedError: model=`jinaai/jina-colbert-v2` does not support `rerank`. Reason: the loaded moded cannot fullyfill `rerank`. Options are {'embed'}.",
"type": null,
"param": null,
"code": 400
}
}
Here is the Jina API result, saved to rank.json:
curl 'https://api.jina.ai/v1/rerank' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <JINA_TOKEN>' \
-d '{
"model": "jina-colbert-v2",
"query": "document text",
"top_n": 3,
"documents": [
"Your document text string goes here",
"You can send multiple texts",
"Each text can be up to 8192 tokens long"
]
}' > rank.json
You can't rerank with this model on infinity; you need to write your own reranking operation.
Reranking with large states is a database operation, whereas infinity is a stateless inference server.
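For anyone writing their own reranking step, here is a minimal sketch of the standard ColBERT late-interaction (MaxSim) score computed client-side from multi-vector embeddings. It assumes the token vectors are L2-normalized, so the dot product acts as cosine similarity, and it is not necessarily identical to what the Jina /rerank endpoint computes. The names maxsim_score and rerank are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Vector = Sequence[float]
Matrix = Sequence[Vector]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Matrix, doc_vecs: Matrix) -> float:
    """For each query token take its best-matching document token, then sum.

    Raises ValueError if doc_vecs is empty (max() of an empty sequence).
    """
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def rerank(
    query_vecs: Matrix,
    docs_vecs: Sequence[Matrix],
    top_n: Optional[int] = None,
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs sorted by descending score."""
    scored = [(i, maxsim_score(query_vecs, d)) for i, d in enumerate(docs_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]  # top_n=None keeps every document
```

Usage: embed the query with input_type "query" and each document with input_type "document", then call rerank(query_vecs, [doc1_vecs, doc2_vecs, ...], top_n=3). Storing and looking up the document vectors is left to whatever database holds them, in line with the point that infinity itself is stateless.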