fitsleepinsights

Current RAG limitations

Open galeone opened this issue 2 years ago • 0 comments

The current RAG implementation is limited and, more often than not, provides wrong answers.

It works in this way:

Data Preparation

The structured data is converted into a report. The embedding of the report is computed using the gecko 003 model.

Problems

  • The report contains only a small subset of the information available.
  • The report covers a single day only (daily report).
  • The report is populated by the LLM, which sometimes inserts generated information instead of extracting it (as requested) from the JSON data provided.
  • Only one embedding is computed per report, because the document is short enough for the model to convert it into a single vector. Is this good, or should we use a window-based approach?
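
For reference, a window-based approach could look like the minimal sketch below. It only shows the splitting step: each window would then be embedded separately instead of embedding the whole report once. The function name, the word-based splitting, and the `size`/`overlap` defaults are assumptions made for illustration, not part of the current implementation.

```python
from typing import List


def sliding_windows(text: str, size: int = 50, overlap: int = 10) -> List[str]:
    """Split text into overlapping windows of whitespace-separated words.

    Hypothetical helper: `size` and `overlap` are counted in words, not in
    model tokens, and their defaults are arbitrary.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    windows = []
    for start in range(0, len(words), step):
        windows.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reaches the end of the text
    return windows
```

A report shorter than `size` words yields a single window, which matches the current one-embedding-per-report behavior. Whether several vectors per report actually improve retrieval is still the open question.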

Data insertion

The report and the embeddings are stored inside the database.

Document Retrieval

The user asks a question. The question is embedded and a similarity search is run against the stored embeddings. The first 3 documents are retrieved and passed to Gemini as context.
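
The retrieval step described above can be sketched as follows. This is an in-memory illustration only: it assumes cosine similarity as the measure (the actual measure and the database query are not shown in this issue), and `top_k` and its arguments are hypothetical names.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    documents: Sequence[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k (document id, score) pairs with the highest similarity."""
    scored = [(doc_id, cosine_similarity(query, emb)) for doc_id, emb in documents]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Note that `top_k` returns `k` documents whenever at least `k` exist, however low their scores are. A fixed `k = 3` therefore always fills the context with 3 reports.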

Problems

  • How can we be sure that the documents are relevant? The similarity measure, although it provides a good numerical score, sometimes returns the wrong documents (e.g. the question asks for activities on March 8, and the first document retrieved is the report for March 18).
  • 3 is a totally arbitrary number. It's wrong on so many levels:
    • When asking for a single day, we are giving (at least) 2 other days for no reason.
    • When asking for data that should span multiple days, the context is limited to 3 reports.
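
To make both problems concrete, here is a minimal sketch of the alternative behavior: selecting reports by date instead of by similarity rank, so that the number of reports follows the question. It assumes the requested date range has already been extracted from the question, which is the hard part and is not addressed here. `reports_in_range` and the dictionary layout are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, List


def reports_in_range(reports: Dict[date, str], start: date, end: date) -> List[str]:
    """Return the daily reports from start to end (inclusive), in date order.

    Days without a stored report are skipped.
    """
    if end < start:
        raise ValueError("end must not be before start")
    selected = []
    day = start
    while day <= end:
        if day in reports:
            selected.append(reports[day])
        day += timedelta(days=1)
    return selected
```

With this selection, a question about March 8 gets exactly one report and can never receive the March 18 one, while a question about a whole week gets up to 7 reports instead of 3.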

Potential solutions

These are not "solutions", just some tweaks/changes that could at least lead the LLM to work on better data.

  • [ ] Improve the reports: manually populate them instead of asking an LLM to do the magic. Write all the queries and improve the sections of the reports, extracting as much useful information as possible.
  • [ ] The reports should contain the questions that the user could ask
  • [ ] The reports should be aggregated: we should be able to generate weekly and monthly reports, because users are likely to ask for this type of information
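
As a rough sketch of the first and third items, the snippet below builds weekly reports directly from structured daily records, with no LLM involved in writing the text. The record fields (`day`, `steps`, `sleep_minutes`) and the report wording are assumptions for illustration; the real data has many more fields.

```python
from collections import defaultdict
from datetime import date
from statistics import mean
from typing import Dict, List, Tuple


def weekly_reports(daily: List[dict]) -> Dict[Tuple[int, int], str]:
    """Group daily records by ISO (year, week) and render one report per week."""
    by_week = defaultdict(list)
    for record in daily:
        iso = record["day"].isocalendar()  # (ISO year, ISO week, ISO weekday)
        by_week[(iso[0], iso[1])].append(record)

    reports = {}
    for (year, week), records in sorted(by_week.items()):
        total_steps = sum(r["steps"] for r in records)
        avg_sleep = mean(r["sleep_minutes"] for r in records)
        reports[(year, week)] = (
            f"Week {week} of {year}: {len(records)} days tracked, "
            f"{total_steps} total steps, "
            f"{avg_sleep:.0f} minutes of sleep per night on average."
        )
    return reports
```

Because every number in the text comes from a computation on the data, the report cannot contain generated values. Monthly reports would follow the same pattern with `(year, month)` as the grouping key.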

Questions

  • How can we fetch all the relevant information and be sure to have fetched the correct data?
  • How can the LLM interpolate/aggregate data among different reports and extract correct values? It looks like LLMs are not good at doing calculations
  • How can we make the LLM also look at the data in the database, directly, if needed?
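
One possible direction for the last two questions, sketched under heavy assumptions: compute the aggregate in the database and hand only the result to the LLM as context, so the model never has to do the arithmetic. The sketch uses `sqlite3` so that it is self-contained; the `daily_activity` table, its columns, and both function names are hypothetical and do not reflect the project's actual schema or database.

```python
import sqlite3
from typing import Optional, Tuple


def average_steps(
    conn: sqlite3.Connection, start: str, end: str
) -> Tuple[Optional[float], int]:
    """Average steps and number of days between two ISO dates (inclusive)."""
    row = conn.execute(
        "SELECT AVG(steps), COUNT(*) FROM daily_activity "
        "WHERE day BETWEEN ? AND ?",
        (start, end),
    ).fetchone()
    return row[0], row[1]  # the average is None when no rows match


def build_context(conn: sqlite3.Connection, start: str, end: str) -> str:
    """Render the computed values as a sentence to pass to the LLM."""
    avg, days = average_steps(conn, start, end)
    if days == 0:
        return f"No activity data between {start} and {end}."
    return (
        f"Between {start} and {end} ({days} days), "
        f"the average was {avg:.0f} steps per day."
    )
```

This leaves open how to decide, from the user's question, which query to run and over which range.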

galeone, Apr 10 '24 16:04