
[Bug] Usage details (promptTokens, completionTokens, and totalTokens) are not returned when running a simple RAG app with a local PGVector DB on Spring AI 1.0.0-M2; the same app works fine with M1.

Open Reynolds045 opened this issue 1 year ago • 1 comments

Bug description I am getting 0s for promptTokens, completionTokens, and totalTokens when trying to read the token consumption for a particular query.

This is happening in a simple RAG application backed by Spring AI 1.0.0-M2, a PGVector database, and the spring-ai-openai-spring-boot-starter package.

I am setting up the ChatClient object with the advisors QuestionAnswerAdvisor(vectorStore, SearchRequest.defaults()) and MessageChatMemoryAdvisor(new InMemoryChatMemory()).

I am getting a ChatResponse object with an EmptyUsage object inside, and thus zeros for promptTokens, completionTokens, and totalTokens.

I dropped the Spring AI version from M2 to M1 as a test, and with M1 I do get the token consumption details.

Alternatively, when I comment out QuestionAnswerAdvisor(vectorStore, SearchRequest.defaults()) on M2, I also get the token consumption in the console, because the ChatResponse then contains a Usage object instead of EmptyUsage.

Environment MS Windows 10, Spring AI version 1.0.0-M2, Java version 21, Spring Boot version 3.3.4, vector store on a locally running Postgres instance.

Note: I am NOT using Docker compose.

Steps to reproduce Set up a basic Spring Boot project with the dependencies spring-boot-starter-web, spring-ai-openai-spring-boot-starter, spring-ai-tika-document-reader, and spring-ai-pgvector-store-spring-boot-starter, plus postgresql and spring-boot-starter-jdbc for the DB.
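
For reference, the dependency list above might look roughly like this in a Maven pom.xml. This is only a sketch: it assumes a Maven build, and it assumes that versions are managed by the Spring Boot parent and an imported spring-ai-bom (1.0.0-M2), neither of which is stated above. The linked repository may be set up differently.

```xml
<!-- Sketch only: assumes Maven, with versions managed by the Spring Boot
     parent and an imported spring-ai-bom (assumed, not stated in this issue). -->
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-jdbc</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-tika-document-reader</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-pgvector-store-spring-boot-starter</artifactId>
    </dependency>
    <dependency>
        <groupId>org.postgresql</groupId>
        <artifactId>postgresql</artifactId>
        <scope>runtime</scope>
    </dependency>
</dependencies>
```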

Expose 2 endpoints, one for uploading PDF files and the other for asking questions about them, a typical RAG flow. Printing the ChatResponse object with System.out.println shows an EmptyUsage object when using Spring AI M2 and a Usage object when using Spring AI M1.

Expected behavior In M2 as well, the ChatResponse object should carry the token consumption details in a Usage object instead of an EmptyUsage.

Minimal Complete Reproducible example https://github.com/Reynolds045/simpleRag

Follow the README.md there.

Reynolds045 Sep 24 '24 18:09