Does the ColBERT model return the right results for query and document vector embeddings and rerank?

Open irelance opened this issue 9 months ago • 1 comments

System Info

Docker image: docker.io/michaelf34/infinity:0.0.75

GPU: not set

Model: jinaai/jina-colbert-v2

Information

[ ] Docker + cli
[ ] pip + cli
[ ] pip + usage of Python interface

Tasks

[ ] An officially supported CLI command
[ ] My own modifications

Reproduction

Reference

https://jina.ai/news/jina-colbert-v2-multilingual-late-interaction-retriever-for-embedding-and-reranking/

Embeddings

The article says there is an input_type parameter (query|document) that returns different embeddings.

But I found that input_type has no effect here. The size of the returned embedding is determined by the token count, and setting dimensions to a value larger than the token count returns an error.

The shape is always [f(token.size), 1028]; if dimensions is set, the returned shape is [dimensions, 1028].

However, I tried the Jina API (https://jina.ai/api-dashboard/embedding) and ran the tests below (replace <JINA_TOKEN>). It returns shape [f(token.size), dimensions] for a document and [32, dimensions] for a query.

document-64.json

curl 'https://api.jina.ai/v1/multi-vector' \
	 -H 'Content-Type: application/json' \
	 -H 'Authorization: Bearer <JINA_TOKEN>' \
	 -d '{
	"model": "jina-colbert-v2",
	"dimensions": 64,
	"input_type": "document",
	"embedding_type": "float",
	"input": [
		"Your document text string goes here",
		"You can send multiple texts",
		"Each text can be up to 8192 tokens long"
    ]}'  > document-64.json

document-128.json

curl 'https://api.jina.ai/v1/multi-vector' \
	 -H 'Content-Type: application/json' \
	 -H 'Authorization: Bearer <JINA_TOKEN>' \
	 -d '{
	"model": "jina-colbert-v2",
	"dimensions": 128,
	"input_type": "document",
	"embedding_type": "float",
	"input": [
		"Your document text string goes here",
		"You can send multiple texts",
		"Each text can be up to 8192 tokens long"
    ]}'  > document-128.json

query-128.json

curl 'https://api.jina.ai/v1/multi-vector' \
	 -H 'Content-Type: application/json' \
	 -H 'Authorization: Bearer <JINA_TOKEN>' \
	 -d '{
	"model": "jina-colbert-v2",
	"dimensions": 128,
	"input_type": "query",
	"embedding_type": "float",
	"input": [
		"Your document text string goes here",
		"You can send multiple texts",
		"Each text can be up to 8192 tokens long"
    ]}'  > query-128.json

Compare query-128.json and document-128.json: the same text gets different vectors depending on input_type. This differs from the infinity implementation, which returns the same vectors for both.
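To make the shape comparison concrete, here is a minimal Python sketch that reads the saved responses and reports a (token_count, dimensions) pair per input. It assumes each response is a JSON object with a "data" list whose items carry an "embeddings" list of per-token vectors; that layout is an assumption on my part and should be checked against the actual files. The helper names embedding_shapes and load_shapes are hypothetical.

```python
import json
from typing import Any, Dict, List, Tuple


def embedding_shapes(response: Dict[str, Any]) -> List[Tuple[int, int]]:
    """Return (token_count, dimensions) for every input in a multi-vector response.

    Assumed layout (not confirmed by the issue):
    {"data": [{"embeddings": [[...], [...], ...]}, ...]}
    """
    shapes = []
    for item in response["data"]:
        vectors = item["embeddings"]
        dims = len(vectors[0]) if vectors else 0
        shapes.append((len(vectors), dims))
    return shapes


def load_shapes(path: str) -> List[Tuple[int, int]]:
    """Load a saved response file (e.g. document-128.json) and report its shapes."""
    with open(path, encoding="utf-8") as f:
        return embedding_shapes(json.load(f))


# Synthetic response with two inputs: 3 tokens and 1 token, 2 dimensions each.
sample = {
    "data": [
        {"embeddings": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]},
        {"embeddings": [[0.7, 0.8]]},
    ]
}
sample_shapes = embedding_shapes(sample)
```

Running load_shapes on query-128.json and document-128.json should show whether the first number is fixed for queries and varies with text length for documents.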

Rerank

jina-colbert-v2 also has a /rerank API, but on infinity the Swagger UI returns:

{
  "error": {
    "message": "ModelNotDeployedError: model=`jinaai/jina-colbert-v2` does not support `rerank`. Reason: the loaded moded cannot fullyfill `rerank`. Options are {'embed'}.",
    "type": null,
    "param": null,
    "code": 400
  }
}

Here is the equivalent request against the Jina API, with the result saved to rank.json:

curl 'https://api.jina.ai/v1/rerank' \
	 -H 'Content-Type: application/json' \
	 -H 'Authorization: Bearer <JINA_TOKEN>' \
	 -d '{
      "model": "jina-colbert-v2",
      "query": "document text",
      "top_n": 3,
      "documents": [
            "Your document text string goes here",
       		"You can send multiple texts",
       		"Each text can be up to 8192 tokens long"
       ]
    }'  > rank.json

Mar 17 '25 16:03 irelance

You can't rerank with this model on infinity. You need to write your own reranking operation.

Reranking with large states is a database operation; infinity is a stateless inference server.

Mar 17 '25 16:03 michaelfeil
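
For readers wondering what "your own reranking operation" might look like: ColBERT-style models are usually scored with late interaction (MaxSim), where each query token vector is matched to its best document token vector and the maxima are summed. Below is a minimal pure-Python sketch of that idea. It assumes you already have per-token vectors for the query and each document (for example from the embeddings endpoint) and that they are L2-normalized so the dot product acts as cosine similarity. The function names are hypothetical, and this is not infinity's or Jina's implementation.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product; equals cosine similarity if both vectors are unit length."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Sequence[Vector], doc_vecs: Sequence[Vector]) -> float:
    """For each query token take its best-matching document token, then sum."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def rerank(
    query_vecs: Sequence[Vector],
    docs_vecs: Sequence[Sequence[Vector]],
    top_n: int = 3,
) -> List[Tuple[int, float]]:
    """Return (document_index, score) pairs, best first, limited to top_n."""
    scored = [(i, maxsim_score(query_vecs, d)) for i, d in enumerate(docs_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy data: 2 query tokens and 3 documents with 2-dimensional unit vectors.
query = [[1.0, 0.0], [0.0, 1.0]]
docs = [
    [[1.0, 0.0], [0.0, 1.0]],  # matches both query tokens exactly
    [[1.0, 0.0]],              # matches only the first query token
    [[0.6, 0.8]],              # partially matches both query tokens
]
ranked = rerank(query, docs, top_n=3)
```

With the toy data, document 0 ranks first, followed by document 2 and then document 1. In a real setup the document vectors would be stored ahead of time, so only the query has to be embedded at request time, which is the "database operation" referred to above.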