[Query question] How the vanna ai get the correct value in database table?
First of all, thank you for your efforts. I have a question, how do you ensure that vanna ai gets the correct data, assuming there are ten tables, each table has ten fields, and each table has more than 10,000 data. If I donβt know the table name and field name, just ask a question related to the value of a data item inside, how can AI answer it?
I have a problem similar to yours. And I found that none of the text2sql models/studies explained how to get and map the question content to the specific value of the data in the database table. Take training_data in this project as an example:
{ "question":"What are the 10 countries with highest government debt in 2008 ? ", "answer":"SELECT geo_name,\r\n value\r\nFROM data_commons_public_data.cybersyn.timeseries join data_commons_public_data.cybersyn.geo_index\r\n ON timeseries.geo_id = geo_index.id\r\nWHERE variable_name = 'Amount of Debt: Government'\r\n and date = '2008-01-01'\r\n and geo_id like 'country\/%'\r\nORDER BY value desc limit 10;" }
How can we know the variable_name field has a value Amount of Debt: Government but not government debt?
I have a problem similar to yours. And I found that none of the text2sql models/studies explained how to get and map the question content to the specific value of the data in the database table. Take training_data in this project as an example:
{ "question":"What are the 10 countries with highest government debt in 2008 ? ", "answer":"SELECT geo_name,\r\n value\r\nFROM data_commons_public_data.cybersyn.timeseries join data_commons_public_data.cybersyn.geo_index\r\n ON timeseries.geo_id = geo_index.id\r\nWHERE variable_name = 'Amount of Debt: Government'\r\n and date = '2008-01-01'\r\n and geo_id like 'country\/%'\r\nORDER BY value desc limit 10;" }How can we know thevariable_namefield has a valueAmount of Debt: Governmentbut notgovernment debt?
what this project has done maybe is just add domentation to rectify the answer.
But from my experience , we can just add distinct value of field to the documentation.