langchain
langchain copied to clipboard
DOC: Get only the sql query not the output of the query in SQLDatabaseChain
Issue with current documentation:
Hi, I need to get just the query only for a natural language input query. I don't want to get the results/output of the query from the db, how do I do that?
Thanks!
Idea or request for content:
No response
I would also like to know
I have this need as well. The patch is straightforward (see https://github.com/hwchase17/langchain/blob/master/langchain/chains/sql_database/base.py) and I'm happy to open a PR on this if one of the maintainers (@imeckr ? @hwchase17 ?) wants to comment on the best approach.
I'd propose setting a return_sql=True
similar to return_direct=True
, and more-or-less implementing how return_direct
works.
(My use case is that I want to do some processing of the SQL query before execution.)
I am not maintainer but I find the feature request super useful. For what I can see I would recommend creating another chain SQLQueryGenChain
and not introduce a new parameter, also use it inSQLDatabaseChain
(composability). Though this approach would be different from the existing approach where depending upon parameters such as return_direct
or use_query_checker
llm chains are being called inside the _call
method. And, if we would have a SQLQueryCheckerChain
it allows independent use as will as inside SQLDatabaseChain
as well.
What do you think @hwchase17 @dev2049 ? I could pick this up if you find it interesting.
Isn't it possible to actually duplicate the functionality of the sql generator with a normal LLMChain
?
Yes very much possible but it would be great to have it handy though.
hi, I was looking to add an additional function that could be passed to the SQLDatabaseChain. The idea being that this function is passed the generated sql and could sanity check it before it is executed. Currently the flat query_checker_prompt
would send the resulting sql back to llm to verify it's correct. What I would like to do is to be able to check, using something like sqlparse, that the syntax is correct or that the query that is returned could be adjusted for additional filtering/checking.
Happy to submit a PR for this, but would like to have suggestions on how best this ought to be implemented.
Please see the change here:
https://github.com/lingster/langchain/commit/8909a0afdf03b2353e4239a3482f8efe56bf2bd6
What I did instead was:
- Goto the file
langchain/chains/sql_database/base.py
and setreturn_intermediate_steps to
True
class SQLDatabaseChain(Chain):
"""Chain for interacting with SQL Database.
Example:
.. code-block:: python
from langchain import SQLDatabaseChain, OpenAI, SQLDatabase
db = SQLDatabase(...)
db_chain = SQLDatabaseChain(llm=OpenAI(), database=db)
"""
llm: BaseLanguageModel
"""LLM wrapper to use."""
database: SQLDatabase = Field(exclude=True)
"""SQL Database to connect to."""
prompt: BasePromptTemplate = PROMPT
"""Prompt to use to translate natural language to SQL."""
top_k: int = 5
"""Number of results to return from the query"""
input_key: str = "query" #: :meta private:
output_key: str = "result" #: :meta private:
return_intermediate_steps: bool = True
"""Whether or not to return the intermediate steps along with the final answer."""
return_direct: bool = False
"""Whether or not to return the result of querying the SQL table directly."""
- Goto the file
langchain/chains/base.py
and updatedrun
function as:
def run(self, *args: Any, **kwargs: Any) -> str:
"""Run the chain as text in, text out or multiple variables, text out."""
# if len(self.output_keys) != 1:
# raise ValueError(
# f"`run` not supported when there is not exactly "
# f"one output key. Got {self.output_keys}."
# )
if args and not kwargs:
if len(args) != 1:
raise ValueError("`run` supports only one positional argument.")
return self(args[0])[self.output_keys[0]], self(args[0])[self.output_keys[1]]
if kwargs and not args:
return self(kwargs)[self.output_keys[0]]
raise ValueError(
f"`run` supported with either positional arguments or keyword arguments"
f" but not both. Got args: {args} and kwargs: {kwargs}."
)
Hence, to make a request, I just run this:
db = SQLDatabase.from_uri(os.getenv('POSTGRES_URI'))
llm = OpenAI(model_name="text-davinci-003", temperature=0.0)
db_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)
output, sql_cmd = db_chain.run(str(query))
Hence, it returns both the results and the sql query.
Thanks @olahsymbo This is a great solution
Closing the issue.
I don't think this issue should be closed, as it requires code changes to mainline with every release.
Yes, but it still executes the query.
@gopesh97, I think the issue shouldn't be closed. The accepted solution executes the query instead of just returning it before execution.
I will be working on making this feature a possibility.
@Ed1123 agreed, I have created a PR. This issue should be reopened.
I'd like to do the same thing but using the MultiQueryRetriever. I can't return only the generated queries, it asks for a "run_manager" parameter.
Any solution to this problem without having to edit source code?
Please reopen
Please reopen