colpali
Why not remove the describe-text embeddings from the image_embeddings output?
I am using the vidore/colqwen2-v0.1 model to embed documents and queries. In the code, I noticed that pixel_values is removed during query processing, but the embeddings of the describe text are not removed from the output during image processing. Does this affect the results of late interaction search?
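To make the concern concrete, here is a minimal sketch of late interaction (MaxSim) scoring as I understand it, in pure Python. The vectors are made-up toy values, not real model outputs, and the names `maxsim_score`, `image_patches`, and `prompt_tokens` are hypothetical and not taken from the library.

```python
# Toy illustration of late-interaction (MaxSim) scoring.
# All vectors are invented for illustration; they are not model outputs.

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def maxsim_score(query_vecs, doc_vecs):
    """For each query token take its best match among the document
    tokens, then sum those maxima over all query tokens."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

# Two query-token embeddings.
query = [[1.0, 0.0], [0.0, 1.0]]

# Embeddings that come from image patches only.
image_patches = [[0.9, 0.1], [0.2, 0.3]]

# Hypothetical embeddings of the describe-text tokens that remain
# in the image embedding output.
prompt_tokens = [[0.1, 0.8]]

score_patches_only = maxsim_score(query, image_patches)
score_with_prompt = maxsim_score(query, image_patches + prompt_tokens)

print(score_patches_only, score_with_prompt)
```

In this toy case the second query token matches the leftover text token better than any image patch, so the score rises. Since a maximum over a larger set can never decrease, keeping these tokens can only raise or preserve each document's score; what I would like to know is whether that changes the ranking in practice.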