langchain icon indicating copy to clipboard operation
langchain copied to clipboard

DOC: Get only the sql query not the output of the query in SQLDatabaseChain

Open gopesh97 opened this issue 1 year ago • 6 comments

Issue with current documentation:

Hi, I need to get just the query only for a natural language input query. I don't want to get the results/output of the query from the db, how do I do that?

Thanks!

Idea or request for content:

No response

gopesh97 avatar May 17 '23 12:05 gopesh97

I would also like to know

Tajcore avatar May 17 '23 12:05 Tajcore

I have this need as well. The patch is straightforward (see https://github.com/hwchase17/langchain/blob/master/langchain/chains/sql_database/base.py) and I'm happy to open a PR on this if one of the maintainers (@imeckr ? @hwchase17 ?) wants to comment on the best approach.

I'd propose setting a return_sql=True similar to return_direct=True, and more-or-less implementing how return_direct works.

(My use case is that I want to do some processing of the SQL query before execution.)

richarddli avatar May 17 '23 21:05 richarddli

I am not maintainer but I find the feature request super useful. For what I can see I would recommend creating another chain SQLQueryGenChain and not introduce a new parameter, also use it inSQLDatabaseChain (composability). Though this approach would be different from the existing approach where depending upon parameters such as return_direct or use_query_checker llm chains are being called inside the _call method. And, if we would have a SQLQueryCheckerChain it allows independent use as will as inside SQLDatabaseChain as well.

What do you think @hwchase17 @dev2049 ? I could pick this up if you find it interesting.

imeckr avatar May 18 '23 11:05 imeckr

Isn't it possible to actually duplicate the functionality of the sql generator with a normal LLMChain?

Tajcore avatar May 18 '23 14:05 Tajcore

Yes very much possible but it would be great to have it handy though.

imeckr avatar May 19 '23 07:05 imeckr

hi, I was looking to add an additional function that could be passed to the SQLDatabaseChain. The idea being that this function is passed the generated sql and could sanity check it before it is executed. Currently the flat query_checker_prompt would send the resulting sql back to llm to verify it's correct. What I would like to do is to be able to check, using something like sqlparse, that the syntax is correct or that the query that is returned could be adjusted for additional filtering/checking.

Happy to submit a PR for this, but would like to have suggestions on how best this ought to be implemented.

Please see the change here:

https://github.com/lingster/langchain/commit/8909a0afdf03b2353e4239a3482f8efe56bf2bd6

lingster avatar May 20 '23 15:05 lingster

What I did instead was:

  1. Goto the file langchain/chains/sql_database/base.py and set return_intermediate_steps to True
class SQLDatabaseChain(Chain):
    """Chain for interacting with SQL Database.

    Example:
        .. code-block:: python

            from langchain import SQLDatabaseChain, OpenAI, SQLDatabase
            db = SQLDatabase(...)
            db_chain = SQLDatabaseChain(llm=OpenAI(), database=db)
    """

    llm: BaseLanguageModel
    """LLM wrapper to use."""
    database: SQLDatabase = Field(exclude=True)
    """SQL Database to connect to."""
    prompt: BasePromptTemplate = PROMPT
    """Prompt to use to translate natural language to SQL."""
    top_k: int = 5
    """Number of results to return from the query"""
    input_key: str = "query"  #: :meta private:
    output_key: str = "result"  #: :meta private:
    return_intermediate_steps: bool = True
    """Whether or not to return the intermediate steps along with the final answer."""
    return_direct: bool = False
    """Whether or not to return the result of querying the SQL table directly."""
  1. Goto the file langchain/chains/base.py and updated run function as:
    def run(self, *args: Any, **kwargs: Any) -> str:
        """Run the chain as text in, text out or multiple variables, text out."""
        # if len(self.output_keys) != 1:
        #     raise ValueError(
        #         f"`run` not supported when there is not exactly "
        #         f"one output key. Got {self.output_keys}."
        #     )

        if args and not kwargs:
            if len(args) != 1:
                raise ValueError("`run` supports only one positional argument.")
            
            return self(args[0])[self.output_keys[0]], self(args[0])[self.output_keys[1]]

        if kwargs and not args:
            return self(kwargs)[self.output_keys[0]]

        raise ValueError(
            f"`run` supported with either positional arguments or keyword arguments"
            f" but not both. Got args: {args} and kwargs: {kwargs}."
        )

Hence, to make a request, I just run this:

db = SQLDatabase.from_uri(os.getenv('POSTGRES_URI'))
llm = OpenAI(model_name="text-davinci-003", temperature=0.0)
db_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)
output, sql_cmd = db_chain.run(str(query))

Hence, it returns both the results and the sql query.

olahsymbo avatar Jun 17 '23 12:06 olahsymbo

Thanks @olahsymbo This is a great solution

Closing the issue.

gopesh97 avatar Jun 26 '23 12:06 gopesh97

I don't think this issue should be closed, as it requires code changes to mainline with every release.

richarddli avatar Jun 26 '23 15:06 richarddli

Yes, but it still executes the query.

richarddli avatar Jun 28 '23 11:06 richarddli

@gopesh97, I think the issue shouldn't be closed. The accepted solution executes the query instead of just returning it before execution.

Ed1123 avatar Jul 08 '23 23:07 Ed1123

I will be working on making this feature a possibility.

keenborder786 avatar Jul 19 '23 14:07 keenborder786

@Ed1123 agreed, I have created a PR. This issue should be reopened.

keenborder786 avatar Jul 19 '23 15:07 keenborder786

I'd like to do the same thing but using the MultiQueryRetriever. I can't return only the generated queries, it asks for a "run_manager" parameter.

GustavoContreiras avatar Aug 15 '23 20:08 GustavoContreiras

Any solution to this problem without having to edit source code?

mfernandezsidn avatar Jan 28 '24 21:01 mfernandezsidn

Please reopen

pmatrasc avatar Mar 13 '24 20:03 pmatrasc

Please reopen

r4kh4t avatar Jun 29 '24 15:06 r4kh4t