
Refactor: Separate embedding logic through the DocumentTransformer

Open youngmoneee opened this issue 1 year ago • 1 comment

This PR aims to achieve two objectives through the proposed changes:

  1. Separate the common embedding logic present in the VectorStore implementations using the DocumentTransformer interface. Isolating the logic that adds embedding data before Documents are inserted into the VectorStore improves maintainability and testability.
  2. Improve batch processing performance by executing blocking operations asynchronously. Embedding requests that currently run sequentially and synchronously are executed on a separate Scheduler using Reactor.
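The first objective can be illustrated with a minimal sketch. It does not use the real Spring AI classes: Doc, Embedder, EmbeddingTransformer, and InMemoryStore are hypothetical stand-ins, and the transformer is modeled as a plain Function from a list of documents to a list of documents. The point is only that the store no longer needs to know how embeddings are produced, so the embedding step can be tested with a fake model.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class EmbeddingTransformerSketch {

    // Hypothetical stand-in for a document; not the Spring AI Document class.
    record Doc(String id, String text, List<Double> embedding) {}

    // Hypothetical stand-in for an embedding model.
    interface Embedder {
        List<Double> embed(String text);
    }

    // The embedding step, isolated as a List -> List transformation.
    static final class EmbeddingTransformer implements Function<List<Doc>, List<Doc>> {
        private final Embedder embedder;

        EmbeddingTransformer(Embedder embedder) {
            this.embedder = embedder;
        }

        @Override
        public List<Doc> apply(List<Doc> docs) {
            // Return new documents that carry the computed embedding.
            return docs.stream()
                    .map(d -> new Doc(d.id(), d.text(), embedder.embed(d.text())))
                    .toList();
        }
    }

    // A store that only persists documents that already carry an embedding.
    static final class InMemoryStore {
        private final Map<String, Doc> rows = new LinkedHashMap<>();

        void add(List<Doc> docs) {
            for (Doc d : docs) {
                if (d.embedding() == null) {
                    throw new IllegalArgumentException("document " + d.id() + " has no embedding");
                }
                rows.put(d.id(), d);
            }
        }

        int size() {
            return rows.size();
        }
    }

    public static void main(String[] args) {
        // A fake embedder keeps the example free of network calls.
        Embedder fake = text -> List.of((double) text.length());
        List<Doc> raw = List.of(new Doc("1", "hello", null), new Doc("2", "spring", null));
        InMemoryStore store = new InMemoryStore();
        store.add(new EmbeddingTransformer(fake).apply(raw));
        System.out.println("stored documents: " + store.size());
    }
}
```

With this split, a test can exercise the transformer against a fake Embedder and exercise the store against pre-embedded documents, without either depending on the other.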

https://github.com/spring-projects/spring-ai/blob/10e1e13fa204b2f634ee874fcee2360f94f18185/vector-stores/spring-ai-weaviate-store/src/main/java/org/springframework/ai/vectorstore/WeaviateVectorStore.java#L327

In the code linked above, the map operation runs sequentially: each task starts only after the previous one has completed.

https://github.com/spring-projects/spring-ai/blob/10e1e13fa204b2f634ee874fcee2360f94f18185/vector-stores/spring-ai-weaviate-store/src/main/java/org/springframework/ai/vectorstore/WeaviateVectorStore.java#L363-L368

https://github.com/spring-projects/spring-ai/blob/10e1e13fa204b2f634ee874fcee2360f94f18185/spring-ai-core/src/main/java/org/springframework/ai/embedding/EmbeddingModel.java#L55-L62

The call method blocks until it receives an EmbeddingResponse object, so executing these blocking calls one after another creates a significant bottleneck.

For comparison, when embedding and inserting the same 100 Document objects into a vector database, the original code took 106 seconds.

[Screenshot 2024-08-19 at 12.57.38 AM]

https://github.com/spring-projects/spring-ai/blob/eb58cf4d0a4e0f1f1cd51a10dfa595315513f4fe/spring-ai-core/src/main/java/org/springframework/ai/transformer/DocumentEmbeddingTransformer.java#L49-L59

To reduce this bottleneck, the new code uses Reactor internally to execute these blocking methods asynchronously, which avoids major modifications to the existing code.

[Screenshot 2024-08-19 at 1.01.24 AM]

After modifying the code to process the tasks on a separate asynchronous scheduler, the execution time dropped from 106 seconds to 8.6 seconds, a 92% decrease in processing time.
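The difference between the two execution models can be sketched as follows. This is not the code from the PR: the PR uses Reactor with a separate Scheduler, while this sketch swaps in java.util.concurrent (a fixed thread pool and CompletableFuture) so that it runs without extra dependencies. The blockingEmbed method, its 50 ms delay, and the pool size are assumptions made for illustration, and the timings it prints are not comparable to the 106 s and 8.6 s measurements above.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.IntStream;

final class AsyncEmbeddingSketch {

    // Hypothetical blocking call standing in for a remote embedding request.
    static double[] blockingEmbed(String text) {
        try {
            Thread.sleep(50); // simulated network latency (assumed value)
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new double[] { text.length() };
    }

    // Sequential: each call starts only after the previous one returns.
    static List<double[]> embedSequentially(List<String> texts) {
        return texts.stream().map(AsyncEmbeddingSketch::blockingEmbed).toList();
    }

    // Concurrent: submit every blocking call to a pool, then join in input order.
    static List<double[]> embedConcurrently(List<String> texts, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Collect all futures first so every request is submitted before any join.
            List<CompletableFuture<double[]>> futures = texts.stream()
                    .map(t -> CompletableFuture.supplyAsync(() -> blockingEmbed(t), pool))
                    .toList();
            return futures.stream().map(CompletableFuture::join).toList();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> texts = IntStream.range(0, 20).mapToObj(i -> "doc-" + i).toList();
        long t0 = System.nanoTime();
        embedSequentially(texts);
        long t1 = System.nanoTime();
        embedConcurrently(texts, 10);
        long t2 = System.nanoTime();
        System.out.printf("sequential: %d ms, concurrent: %d ms%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
    }
}
```

Because the work is dominated by waiting on I/O rather than by CPU, running the blocking calls on a dedicated pool lets the waits overlap. The results are joined in input order, so callers still see embeddings aligned with their documents.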


This PR aims to optimize performance with minimal changes to the existing code. In the long term, however, I think expressing the ETL pipeline as a stream, rather than as batch processing over a List, would be more appropriate.

I have created an issue (#1219) related to this topic. I would appreciate any insights or thoughts you might have.

It would be great if you could take a look at the issue when you have time.

Thanks 🧑🏼‍💻

youngmoneee, Aug 18 '24 19:08

review in light of https://github.com/spring-projects/spring-ai/commit/087de16cfc4f6e2d646ebaafeadf45140ee75752

markpollack, Sep 17 '24 18:09