
[Pinecone] need more distance metric support

Open cheedonghu opened this issue 10 months ago • 3 comments

I followed this tutorial to set up my Pinecone database, but I failed to get results like those in the example.

So I checked PineconeVectorStore.class and noticed that the similarity search function filters matches by the response's score. The scores in my results were negative numbers because my Pinecone index uses cosine as its distance metric.

return queryResponse.getMatchesList()
	.stream()
	.filter(scoredVector -> scoredVector.getScore() >= request.getSimilarityThreshold()) // here my scores are negative numbers
	.map(scoredVector -> {
		var id = scoredVector.getId();
		Struct metadataStruct = scoredVector.getMetadata();
		var content = metadataStruct.getFieldsOrThrow(CONTENT_FIELD_NAME).getStringValue();
		Map<String, Object> metadata = extractMetadata(metadataStruct);
		metadata.put(DISTANCE_METADATA_FIELD_NAME, 1 - scoredVector.getScore());
		return new Document(id, content, metadata);
	})
	.toList();
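
The effect of that filter on negative scores can be reproduced in isolation. This is only a minimal sketch, not the actual PineconeVectorStore code: `Match` and `filterByThreshold` are hypothetical names standing in for Pinecone's scored vector type and the stream filter above, and the thresholds 0.0 and -1.0 are assumed values chosen for illustration.

```java
import java.util.List;

public class ThresholdFilterSketch {

    // Hypothetical stand-in for a scored match: only an id and a score.
    record Match(String id, double score) {}

    // Same comparison as the filter in the snippet: keep score >= threshold.
    static List<Match> filterByThreshold(List<Match> matches, double threshold) {
        return matches.stream()
            .filter(m -> m.score() >= threshold)
            .toList();
    }

    public static void main(String[] args) {
        // Assumed example scores, all negative as in the report.
        List<Match> matches = List.of(new Match("a", -0.12), new Match("b", -0.40));

        // With a threshold of 0.0 every negative score is filtered out.
        System.out.println(filterByThreshold(matches, 0.0).size());  // prints 0

        // With a threshold of -1.0 both matches survive.
        System.out.println(filterByThreshold(matches, -1.0).size()); // prints 2
    }
}
```

Under these assumptions, any non-negative threshold yields an empty result when all scores are negative, which matches the "Get nothing" behavior reported here.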

Expected Behavior Get the correct results.

Current Behavior Get nothing.

Context spring-ai.version is 0.8.1

Could the Pinecone vector store add a distance-type configuration property, like the one in the Neo4jVectorStore properties, to allow a custom distance metric?

Thanks!

cheedonghu avatar Apr 21 '24 02:04 cheedonghu

OK, understood. Moving this to M2 for now to support whatever other similarity metrics Pinecone has.

BTW, it is valid to have negative numbers for cosine similarity
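
As a quick check of that point, cosine similarity ranges over [-1, 1], so vectors pointing in opposing directions score below zero. The sketch below is a plain-Java illustration with a hypothetical `cosine` helper and made-up vectors; it does not use any Pinecone or Spring AI API.

```java
public class CosineSketch {

    // Cosine similarity: dot(a, b) / (|a| * |b|). Assumes equal-length, non-zero vectors.
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Opposite directions: the similarity is negative.
        System.out.println(cosine(new double[] {1, 0}, new double[] {-1, 0}));
        // Orthogonal directions: the similarity is zero.
        System.out.println(cosine(new double[] {1, 0}, new double[] {0, 1}));
        // Same direction: the similarity is positive.
        System.out.println(cosine(new double[] {1, 0}, new double[] {1, 0}));
    }
}
```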

markpollack avatar May 24 '24 14:05 markpollack

@cheedonghu I am taking a look at this issue. From what I understand, in the case of Pinecone, you specify the similarity metric when you create the index (as seen in this example). How can we specify the metric when we do similarity search? Is that what you meant in your original request? Thanks.

sobychacko avatar Jul 30 '24 23:07 sobychacko

How can we specify the metric when we do similarity search?

Yes. I chose cosine as my metric at that time and found that I couldn't get any results because of the code I posted (request.getSimilarityThreshold()): the scores I got were negative numbers. So I realized that Spring AI may support only one type of similarity metric in Pinecone, and then opened this issue.

cheedonghu avatar Aug 05 '24 07:08 cheedonghu

@cheedonghu Sorry for not getting back to you sooner. My understanding is that you set the metric dimension when you set up the index on Pinecone. Then, in the Pinecone vector store implementation in spring-ai, we specify the index name and do the similarity search based on what you set on the server. It doesn't look like the search implementation has control over what metric dimension to use.

This test passes on our end: https://github.com/spring-projects/spring-ai/blob/main/vector-stores/spring-ai-pinecone-store/src/test/java/org/springframework/ai/vectorstore/PineconeVectorStoreIT.java. I used the default cosine metric when the spring-ai-test-index was created on the Pinecone server.

Can you take that test and see if it works for you? Could you try creating the index with various metric types on the server and run the test? If there are issues, please let us know.

sobychacko avatar Aug 17 '24 00:08 sobychacko

This test passes on our end: https://github.com/spring-projects/spring-ai/blob/main/vector-stores/spring-ai-pinecone-store/src/test/java/org/springframework/ai/vectorstore/PineconeVectorStoreIT.java. I used the default cosine metric when the spring-ai-test-index was created on the Pinecone server.

@sobychacko I tried running the test class, but the dimension for the test is 384, whereas mine is 4096 (and I chose dotproduct as the metric) because I'm using Ollama 3.1. I attempted to modify the dimension, but failed. Then I returned to my project and updated the Spring AI version from 0.8.1 to the latest to check whether that resolved the problem I posted in this issue. Thankfully, the function worked correctly. Although I still have some questions, I think the issue can be closed as fixed. I appreciate the hard work you all have put in.

cheedonghu avatar Aug 25 '24 02:08 cheedonghu