langchain icon indicating copy to clipboard operation
langchain copied to clipboard

Fix hallucination problem with `SQLDatabaseChain`

Open ajndkr opened this issue 2 years ago • 16 comments

Description

Based on my experiments with SQLDatabaseChain class, I have noticed that the LLM tends to hallucinate Answer if SQLResult field results as []. As I primarily used OpenAI as the LLM, I am not sure if this problem exists for other LLMs.

This is surely an undesirable outcome for users if the agent responds back with incorrect answers. With some prompt engineering, I think the following can be added to the default prompt to tackle this problem:

If the SQLResult is empty, the Answer should be "No results found". DO NOT hallucinate an answer if there is no result.

If there are better prompts to tackle this, that's great too!

ajndkr avatar Feb 23 '23 18:02 ajndkr

I would like to contribute if this fix makes sense. @hwchase17

ajndkr avatar Feb 23 '23 18:02 ajndkr

cc @fpingham who did a lot of work on SqlDatabaseChain... does this seem reasonable?

agola11 avatar Mar 10 '23 06:03 agola11

I've faced this problem as well and I think this would be a good addition to the prompt. @ajndkr did you try it out? Did it solve the problem for you?

fpingham avatar Mar 10 '23 19:03 fpingham

it worked for me but I have to tweak the agent prefix a bit as well.

ajndkr avatar Mar 10 '23 20:03 ajndkr

Do you mean to use the agent instead of the chain? I understand that these are independent.

fpingham avatar Mar 11 '23 17:03 fpingham

sorry, I don't follow the question. Does it matter if I use an agent or chain? Wouldn't the fix be limited to the SqlDatabaseChain as a tool?

ajndkr avatar Mar 11 '23 17:03 ajndkr

AFAIK the agent does not use the chain but instead uses the SQLDatabase object directly

fpingham avatar Mar 13 '23 20:03 fpingham

Ah! now I understand, you are talking about the SQL agent. I think I opened this issue before agent toolkits were introduced.

Maybe the SQL agent doesn't have a hallunication problem but if SQLDatabaseChain is separate from it, we could update it with the suggested change.

ajndkr avatar Mar 14 '23 07:03 ajndkr

Yeah since you said 'the agent prefix' I thought you referred to the agent. I think the agent does not have this problem because it has some kind of instruction not to invent if the tables do not give it an answer. I would go through with this for the chain 👍

fpingham avatar Mar 14 '23 14:03 fpingham

super.

ajndkr avatar Mar 14 '23 14:03 ajndkr

Using SQLDatabaseChain, and Chinook.db . If i ask : "What is my name?" , I get as an answer: SQLQuery: SELECT FirstName, LastName FROM Employee WHERE EmployeeId = 3 SQLResult: [('Jane', 'Peacock')]

Tried to add to the end of the prompt , some alerts, but without success.

rubensmau avatar Mar 15 '23 15:03 rubensmau

Using SQLDatabaseChain, and Chinook.db . If i ask : "What is my name?" , I get as an answer: SQLQuery: SELECT FirstName, LastName FROM Employee WHERE EmployeeId = 3 SQLResult: [('Jane', 'Peacock')]

Tried to add to the end of the prompt , some alerts, but without success.

what's the expected answer here? @rubensmau

ajndkr avatar Mar 15 '23 15:03 ajndkr

I would like to get an answer : " I don´t know"

rubensmau avatar Mar 15 '23 15:03 rubensmau

Based on my suggestion, the answer will be "I don't know" only when SQLResult: []. As the above example is not an empty list, my prompt suggestion won't work.

ajndkr avatar Mar 15 '23 15:03 ajndkr

@rubensmau for your use case, you can look into https://github.com/hwchase17/langchain/blob/ced412e1c12586a032eef937b97f195b9542c6d3/langchain/agents/agent_toolkits/sql/prompt.py#L14

ajndkr avatar Mar 15 '23 15:03 ajndkr

Great answer @ajndkr , solved my problem. Added your text before the user query, and I got as answer: SQLQuery: N/A(sqlite3.OperationalError) near "N": syntax error [SQL: N/A]

captured this answer as an exception, and returned to the user : "I don´t know" .

Many thanks

rubensmau avatar Mar 15 '23 17:03 rubensmau

Hi, @ajndkr! I'm Dosu, and I'm helping the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, the issue is about a problem with the SQLDatabaseChain class where the language model tends to provide incorrect answers when the SQLResult field is empty. You suggested adding a prompt to handle this issue by specifying that if the SQLResult is empty, the answer should be "No results found" instead of hallucinating an answer.

There was a discussion between you and the maintainers, with one maintainer confirming that the fix makes sense and another suggesting using the SQL agent instead of the chain. Additionally, you provided a solution that solved another user's use case.

Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.

Thank you for your contribution to the LangChain repository!

dosubot[bot] avatar Sep 21 '23 16:09 dosubot[bot]