Does the ColBERT model provide the right results for query and document vector embedding and rerank?

Open irelance opened this issue 9 months ago • 1 comment

System Info

Docker image: docker.io/michaelf34/infinity:0.0.75

GPU: not set

Model: jinaai/jina-colbert-v2

Information

  • [ ] Docker + cli
  • [ ] pip + cli
  • [ ] pip + usage of Python interface

Tasks

  • [ ] An officially supported CLI command
  • [ ] My own modifications

Reproduction

Reference

https://jina.ai/news/jina-colbert-v2-multilingual-late-interaction-retriever-for-embedding-and-reranking/

Embeddings

The article says there is an input_type parameter (query|document) that makes the model return different embeddings.

But I found that input_type has no effect here. The dimensions parameter is applied to the token axis rather than the embedding axis, and setting dimensions larger than the number of tokens returns an error.

The shape is always [f(token.size), 1028]; if dimensions is set, the returned shape is [dimensions, 1028].

However, I tried the Jina API (https://jina.ai/api-dashboard/embedding) with the tests below (replace <JINA_TOKEN>). It returns shape [f(token.size), dimensions] for a document and [32, dimensions] for a query.

document-64.json

curl 'https://api.jina.ai/v1/multi-vector' \
	 -H 'Content-Type: application/json' \
	 -H 'Authorization: Bearer <JINA_TOKEN>' \
	 -d '{
	"model": "jina-colbert-v2",
	"dimensions": 64,
	"input_type": "document",
	"embedding_type": "float",
	"input": [
		"Your document text string goes here",
		"You can send multiple texts",
		"Each text can be up to 8192 tokens long"
    ]}'  > document-64.json

document-128.json

curl 'https://api.jina.ai/v1/multi-vector' \
	 -H 'Content-Type: application/json' \
	 -H 'Authorization: Bearer <JINA_TOKEN>' \
	 -d '{
	"model": "jina-colbert-v2",
	"dimensions": 128,
	"input_type": "document",
	"embedding_type": "float",
	"input": [
		"Your document text string goes here",
		"You can send multiple texts",
		"Each text can be up to 8192 tokens long"
    ]}'  > document-128.json

query-128.json

curl 'https://api.jina.ai/v1/multi-vector' \
	 -H 'Content-Type: application/json' \
	 -H 'Authorization: Bearer <JINA_TOKEN>' \
	 -d '{
	"model": "jina-colbert-v2",
	"dimensions": 128,
	"input_type": "query",
	"embedding_type": "float",
	"input": [
		"Your document text string goes here",
		"You can send multiple texts",
		"Each text can be up to 8192 tokens long"
    ]}'  > query-128.json

Take a look at query-128.json and document-128.json: the same text gets different vectors depending on input_type. This differs from the infinity implementation, which returns the same vectors in both cases.
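To make the shape comparison above easy to repeat, here is a minimal Python sketch that prints the per-input shape of a saved response. It assumes the response is laid out as `{"data": [{"embeddings": [[...], ...]}, ...]}`; that layout, and the helper names `nested_shape`, `multi_vector_shapes`, and `load_shapes`, are assumptions for illustration, so adjust the keys to whatever the saved files actually contain.

```python
import json


def nested_shape(value):
    """Return the shape of a rectangular nested list, e.g. [[0, 0], [0, 0]] -> (2, 2)."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        if not value:
            break
        value = value[0]  # assumes all rows have the same length
    return tuple(shape)


def multi_vector_shapes(response):
    """One (tokens, dimensions) tuple per input text.

    Assumed layout: response["data"][i]["embeddings"] is a list of token vectors.
    """
    return [nested_shape(item["embeddings"]) for item in response["data"]]


def load_shapes(path):
    """Read a saved JSON response (e.g. document-128.json) and return its shapes."""
    with open(path, encoding="utf-8") as handle:
        return multi_vector_shapes(json.load(handle))
```

Calling `load_shapes("document-128.json")` and `load_shapes("query-128.json")` side by side should show whether the token count or the embedding size changes between the two input types.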

Rerank

The Jina API also offers /rerank for jina-colbert-v2, but infinity's Swagger UI returns:

{
  "error": {
    "message": "ModelNotDeployedError: model=`jinaai/jina-colbert-v2` does not support `rerank`. Reason: the loaded moded cannot fullyfill `rerank`. Options are {'embed'}.",
    "type": null,
    "param": null,
    "code": 400
  }
}

Here is the Jina result (rank.json):

curl 'https://api.jina.ai/v1/rerank' \
	 -H 'Content-Type: application/json' \
	 -H 'Authorization: Bearer <JINA_TOKEN>' \
	 -d '{
      "model": "jina-colbert-v2",
      "query": "document text",
      "top_n": 3,
      "documents": [
            "Your document text string goes here",
            "You can send multiple texts",
            "Each text can be up to 8192 tokens long"
       ]
    }'  > rank.json

irelance avatar Mar 17 '25 16:03 irelance

You can't rerank with this model on infinity. You need to write your own reranking operation.

Reranking with large states is a database operation; infinity is a stateless inference server.

michaelfeil avatar Mar 17 '25 16:03 michaelfeil
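For readers who need to write the reranking step themselves, the following is a minimal sketch of ColBERT-style late-interaction (MaxSim) scoring in plain Python. It is not infinity's or Jina's implementation. It assumes you already have one list of token vectors for the query and one per document (for example from the embedding endpoint), and that the vectors are normalized so a dot product acts as the similarity. The function names `maxsim_score` and `rerank` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Sequence[Vector], doc_vecs: Sequence[Vector]) -> float:
    """Sum, over query tokens, of the best-matching document token similarity."""
    if not doc_vecs:
        return 0.0  # an empty document matches nothing
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def rerank(
    query_vecs: Sequence[Vector],
    docs_vecs: Sequence[Sequence[Vector]],
    top_n: Optional[int] = None,
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, highest score first."""
    scored = [(i, maxsim_score(query_vecs, dv)) for i, dv in enumerate(docs_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]  # top_n=None keeps every document
```

Because the document vectors have to be stored and looked up somewhere, this step usually sits next to the vector store rather than in the inference server, which matches the point made in the comment above.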