
Able to use SQL data in kubeagi

Open bjwswang opened this issue 6 months ago • 1 comments

SQL is the most popular way to store and query structured data. Almost all online services host their data in relational or non-relational databases, so it is worth supporting this kind of data ingestion in our kubeagi system.

With SQL ingestion supported, we can connect LLMs to SQL.

Use case

With SQL data ingested, we can:

  1. [Text To SQL] Generate queries from natural-language questions and run them against the database

Say we have a database that provides city stats. When a user asks a question like

Which city has the highest population?

We can generate a SQL query based on that question and retrieve the data from the SQL database.

Similar to https://docs.llamaindex.ai/en/stable/examples/index_structs/struct_indices/SQLIndexDemo.html#part-1-text-to-sql-query-engine
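A minimal sketch of the text-to-SQL flow described above, not a proposed implementation. It uses an in-memory SQLite database only so the example is self-contained (the plan below targets PostgreSQL and MySQL). The `city_stats` table, its rows, and the helper names are all hypothetical, and `fake_llm` is a placeholder that returns a fixed query where a real LLM call would go.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a hypothetical city_stats table with a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE city_stats ("
        "city_name TEXT PRIMARY KEY, population INTEGER, country TEXT)"
    )
    conn.executemany(
        "INSERT INTO city_stats VALUES (?, ?, ?)",
        [
            ("Toronto", 2930000, "Canada"),
            ("Tokyo", 13960000, "Japan"),
            ("Berlin", 3645000, "Germany"),
        ],
    )
    return conn


def describe_schema(conn: sqlite3.Connection) -> str:
    """Return the CREATE TABLE statements so the LLM can see the schema."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def build_prompt(question: str, schema: str) -> str:
    """Combine schema and question into a single prompt string."""
    return (
        "Given the following schema, write one SQL SELECT statement "
        "that answers the question.\n\n"
        f"Schema:\n{schema}\n\nQuestion: {question}\nSQL:"
    )


def fake_llm(prompt: str) -> str:
    """Placeholder for a real LLM call (assumption): returns a fixed query."""
    return (
        "SELECT city_name, population FROM city_stats "
        "ORDER BY population DESC LIMIT 1"
    )


def text_to_sql(conn: sqlite3.Connection, question: str, llm=fake_llm):
    """Ask the LLM for a query, accept only SELECT, and run it."""
    sql = llm(build_prompt(question, describe_schema(conn)))
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError(f"refusing to run non-SELECT statement: {sql!r}")
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print(text_to_sql(db, "Which city has the highest population?"))
```

The SELECT-only check is a deliberately crude guard; generated SQL should also run under a read-only database account.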

  2. [SQL Retriever] Create chatbots that can answer questions based on database data

When we have multiple tables that provide various data, we can simply ingest the data into a vectorstore with an enhanced index.

Then, when a user asks the chatbot a question, we can run a similarity search of the question against the ingested SQL data and ask the LLM to generate a response using the retrieved SQL data as context.

Similar to https://docs.llamaindex.ai/en/stable/examples/index_structs/struct_indices/SQLIndexDemo.html#part-2-query-time-retrieval-of-tables-for-text-to-sql
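The retriever idea can be sketched roughly as follows. This is an illustration under loud assumptions: the bag-of-words `embed` function stands in for a real embedding model, a plain Python list stands in for the vectorstore, and the `city_stats` table and all helper names are hypothetical. SQLite is used only to keep the example self-contained.

```python
import math
import re
import sqlite3
from collections import Counter


def build_demo_db() -> sqlite3.Connection:
    """Create a hypothetical city_stats table with a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE city_stats (city_name TEXT, population INTEGER, country TEXT)"
    )
    conn.executemany(
        "INSERT INTO city_stats VALUES (?, ?, ?)",
        [
            ("Toronto", 2930000, "Canada"),
            ("Tokyo", 13960000, "Japan"),
            ("Berlin", 3645000, "Germany"),
        ],
    )
    return conn


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase token counts, not a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def ingest_table(conn: sqlite3.Connection, table: str):
    """Turn each row into a text document paired with its embedding.

    The table name is interpolated directly, so it must come from trusted
    configuration, never from user input.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [col[0] for col in cursor.description]
    index = []
    for row in cursor:
        fields = ", ".join(f"{c} = {v}" for c, v in zip(columns, row))
        text = f"table {table}: {fields}"
        index.append((text, embed(text)))
    return index


def retrieve(index, question: str, k: int = 2):
    """Return the k row documents most similar to the question."""
    query = embed(question)
    ranked = sorted(index, key=lambda doc: cosine(query, doc[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


if __name__ == "__main__":
    idx = ingest_table(build_demo_db(), "city_stats")
    context = retrieve(idx, "What is the population of Berlin?", k=1)
    # The retrieved rows would be placed in the LLM prompt as context.
    print(context)
```

Serializing each row as `column = value` text keeps the column names in the embedded document, which is one simple way to let questions match on schema terms as well as values.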

  3. Build custom dashboards based on the insights a user wants to analyze

This is closer to what db-gpt is doing right now: analyzing SQL data with the help of an LLM.

Plan

Of the three use cases above, we can focus on the first two, both of which enhance our QA system:

  1. Text to SQL
  2. Ingest SQL data as a retriever

For the databases we should support, I suggest PostgreSQL and MySQL.

bjwswang avatar Jan 22 '24 03:01 bjwswang

@ggservice007 @nkwangleiGIT @wangxinbiao Please leave your comments.

bjwswang avatar Feb 22 '24 02:02 bjwswang