Issue: Token Limit Exceeded Error in SQL Database Agent
Issue you'd like to raise.
I am trying to work with Snowflake using create_sql_agent, and I am very often getting a token limit error.
This is my code:
```python
from langchain.agents import create_sql_agent, AgentExecutor
from langchain.agents.agent_toolkits import SQLDatabaseToolkit
from langchain.sql_database import SQLDatabase
from langchain.llms.openai import OpenAI
from sqlalchemy.dialects import registry

# Register the Snowflake SQLAlchemy dialect
registry.load("snowflake")

account_identifier = 'xxxx'
user = 'xxxx'
password = 'xxxx'
database_name = 'xxxx'
schema_name = 'xxxx'
warehouse_name = 'xxxx'
role_name = 'xxxx'

conn_string = f"snowflake://{user}:{password}@{account_identifier}/{database_name}/{schema_name}?warehouse={warehouse_name}&role={role_name}"
db = SQLDatabase.from_uri(conn_string)
print("DB===", db)

toolkit = SQLDatabaseToolkit(llm=OpenAI(temperature=0), db=db)
agent_executor = create_sql_agent(
    llm=OpenAI(temperature=0),
    toolkit=toolkit,
    verbose=True,
)
agent_executor.run("Which companies are getting the most reviews in a specific category?")
```
If I ask a straightforward question about a tiny table that has only 5 records, the agent runs well.
If the table is slightly bigger and the question is complex, it throws: InvalidRequestError: This model's maximum context length is 4097 tokens, however you requested 13719 tokens (13463 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.
I think the issue is from SQLDatabaseToolkit. The schema_sql_db tool takes a comma-separated list of tables as input, and its output is the schema and sample rows for those tables. If the query is a bit lengthy or complex, the tool returns pretty much every table's schema along with sample rows. I think that's where we hit the token limit error.
But I don't know how to deal with this issue.
I am also facing the same issue.
@hwchase17 Any thoughts? Any workaround to pass this?
There are a couple of parameters you can adjust to limit the table information that's going to be included in the prompt.
```python
SQLDatabase.from_uri(
    conn_string,
    sample_rows_in_table_info=1,  # default 3
    include_tables=["my_table_1"],  # will restrict to just using my_table_1
)
```
You can also use ignore_tables to limit the tables included in the table info.
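For instance, a minimal sketch of the ignore_tables variant (the table names here are hypothetical):

```python
# Exclude known-large tables from the table info instead of whitelisting.
# Note: include_tables and ignore_tables are mutually exclusive.
db = SQLDatabase.from_uri(
    conn_string,
    sample_rows_in_table_info=1,
    ignore_tables=["audit_log", "event_history"],  # hypothetical large tables
)
```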
I am facing the same issue, even with only two tables included:

```python
sample_rows_in_table_info=1,  # default 3
include_tables=['tblSales', 'tblCustomer']
```

The customer table contains 125 columns and the sales table contains 102 columns.
Question: "List the top 10 customers based on total sales this year." This links both tables, so the fields of both tables are sent to OpenAI.
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens, however you requested 7603 tokens (7347 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.
I have the same issue. I have tables that have 100+ columns and the prompts being generated are just too large.
I might be crazy, but I am thinking about putting a vector db in front of this and then having the query first pull out the columns it thinks are relevant before building up the prompt.
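A rough sketch of that idea, assuming a FAISS vector store and one short description per column (the table/column names below are made up):

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS  # requires `pip install faiss-cpu`

# One short description per column, for every table the agent should know about.
column_docs = [
    "tblSales.total_amount: total sale value in USD",
    "tblSales.sale_date: date the sale was made",
    "tblCustomer.customer_name: display name of the customer",
    # ... one entry per column
]

store = FAISS.from_texts(column_docs, OpenAIEmbeddings())

question = "List the top 10 customers based on total sales this year."
relevant = store.similarity_search(question, k=5)
for doc in relevant:
    print(doc.page_content)  # include only these columns when building the prompt
```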
The agent has to query the list of tables in the database, then the schema for the most relevant tables. If you have lots of relevant-sounding tables, or big schemas for some tables, you hit the limit of the prompt length.
Immediate options are:
- decrease the number of columns on your tables
- try to get access to GPT-4 with an 8k context instead of 4k
- create a view onto your table with just a few columns, then adapt the LangChain code if necessary to make the agent consider the view rather than the table
- give up on the agent and use ChatGPT as a few-shot learner: give it examples of SQL queries that do work on your tables for various natural-language questions, then see whether it does a better job of building new SQL (see the sketch after this list)
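A sketch of that last option: prompt the chat model directly with natural-language/SQL example pairs instead of letting the agent discover the schema (the example queries and table names are hypothetical):

```python
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage

# Hypothetical examples; use queries you know work against your own tables.
examples = """
Q: How many reviews were posted last month?
SQL: SELECT COUNT(*) FROM reviews WHERE created_at >= DATEADD(month, -1, CURRENT_DATE);

Q: Which category has the most products?
SQL: SELECT category, COUNT(*) AS n FROM products GROUP BY category ORDER BY n DESC LIMIT 1;
"""

llm = ChatOpenAI(temperature=0)
messages = [
    SystemMessage(content="Translate questions into Snowflake SQL.\nExamples:\n" + examples),
    HumanMessage(content="Which companies are getting the most reviews in a specific category?"),
]
print(llm(messages).content)  # the generated SQL, to run yourself
```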
Hi Will! Is there a method for tracking token usage as the program runs, specifically with a SQL agent? It's very difficult to identify the source of large token usage. I feel like if there was a way to monitor this, it would give me better insights when it comes to handling this problem.
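One way to monitor this is LangChain's get_openai_callback context manager, which accumulates token counts across every OpenAI call the agent makes. A minimal sketch, assuming the agent_executor defined earlier in the thread:

```python
from langchain.callbacks import get_openai_callback

with get_openai_callback() as cb:
    agent_executor.run("Which companies are getting the most reviews in a specific category?")

print(f"Prompt tokens:     {cb.prompt_tokens}")
print(f"Completion tokens: {cb.completion_tokens}")
print(f"Total tokens:      {cb.total_tokens}")
```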
Switching to the gpt-3.5-turbo-16k-0613 model and then setting agent_type=AgentType.OPENAI_FUNCTIONS in the create_sql_agent method increases the token limit to 16k and gets better responses for me. Still not ideal, but it does help.
You need to include the following import for the AgentType enum:

```python
from langchain.agents.agent_types import AgentType
```
Also note that AgentType.OPENAI_FUNCTIONS only worked well for me with the 0613 models. This behaviour is related to this documentation: https://python.langchain.com/docs/modules/agents/agent_types/openai_functions_agent
I get a lot of parsing errors if I use that AgentType setting with the standard gpt-3.5-turbo model.
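Putting that advice together, a minimal sketch (conn_string as defined earlier in the thread; the model name and agent type are as suggested above):

```python
from langchain.agents import create_sql_agent
from langchain.agents.agent_toolkits import SQLDatabaseToolkit
from langchain.agents.agent_types import AgentType
from langchain.chat_models import ChatOpenAI
from langchain.sql_database import SQLDatabase

llm = ChatOpenAI(model="gpt-3.5-turbo-16k-0613", temperature=0)
db = SQLDatabase.from_uri(conn_string)  # conn_string as earlier in the thread
toolkit = SQLDatabaseToolkit(db=db, llm=llm)

agent_executor = create_sql_agent(
    llm=llm,
    toolkit=toolkit,
    agent_type=AgentType.OPENAI_FUNCTIONS,  # function-calling agent, for 0613 models
    verbose=True,
)
```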
Am I missing something, or do these limits make this functionally useless for most real-world cases? Unless you're looking at a very narrow focus for questions or have a tiny database schema, you're going to run into this limit. I've been trying to get this working with the Microsoft sample database AdventureWorks, which is not particularly big in terms of schema (certainly way smaller than any of our product DB schemas), and even that is far too large for the 4k limit on the model I'm using. Switching to a model with a 16k limit would work in this particular example, but it's still going to be too small for any reasonably sized schema. Unless you've got a narrow focus for questions, you can't limit tables/columns because you don't know what's going to be asked. And if you do know what's going to be asked, why even bother with AI?
I'm not trying to be antagonistic with this question; I'm genuinely curious what the application of this tool is with these limitations.
I'm trying to use the SQL agent at $WORK, and like others we're hitting the token limits. We're already using gpt-3.5-turbo-16k, but even that is not enough to handle the 34k tokens sometimes necessary for the results from SQL queries.
We're experimenting with the ability of the Plan & Execute agent to complete tasks, using the SQL agent as a toolkit.get_tools() provider to the P&E agent.
The task we're trying to accomplish is getting the agent to complete the 'SQL Murder Mystery' challenge.
The agent is currently not able to complete the challenge.
I got it working for my use case; maybe the snippet includes enough hints for others to pick up from.
```python
from typing import List, Optional

from langchain.agents.agent_toolkits import SQLDatabaseToolkit
from langchain.callbacks.manager import CallbackManagerForToolRun
from langchain.chat_models import ChatOpenAI
from langchain.experimental.plan_and_execute import (
    PlanAndExecute,
    load_agent_executor,
    load_chat_planner,
)
from langchain.sql_database import SQLDatabase
from langchain.text_splitter import TokenTextSplitter
from langchain.tools import BaseTool
from langchain.tools.sql_database.tool import QuerySQLDataBaseTool

model = ChatOpenAI(model="gpt-3.5-turbo-16k", temperature=0)
token_limit = 16_000  # the model's context size, for reference
db = SQLDatabase.from_uri("sqlite:///my_database.db")


def split_text(text, chunk_size, chunk_overlap=0):
    text_splitter = TokenTextSplitter(
        chunk_size=chunk_size, chunk_overlap=chunk_overlap
    )
    yield from text_splitter.split_text(text)


class QuerySQLDatabaseTool2(QuerySQLDataBaseTool):
    def _run(
        self,
        query: str,
        run_manager: Optional[CallbackManagerForToolRun] = None,
    ) -> str:
        # Truncate the query result to the first 14k-token chunk so the
        # prompt stays within the model's context window.
        result = self.db.run_no_throw(query)
        return next(split_text(result, chunk_size=14_000))


class SQLDatabaseToolkit2(SQLDatabaseToolkit):
    def get_tools(self) -> List[BaseTool]:
        # Swap the stock query tool for the truncating version above,
        # keeping the original description so the agent uses it the same way.
        tools = super().get_tools()
        original_query_tool_description = tools[0].description
        new_query_tool = QuerySQLDatabaseTool2(
            db=self.db, description=original_query_tool_description
        )
        tools[0] = new_query_tool
        return tools


toolkit = SQLDatabaseToolkit2(db=db, llm=model)
planner = load_chat_planner(model)
executor = load_agent_executor(model, toolkit.get_tools(), verbose=True)
agent = PlanAndExecute(planner=planner, executor=executor, verbose=True)
agent.run(my_request)  # my_request: your natural-language task
```

You'll need tiktoken: `pip install tiktoken` or `poetry add tiktoken`.
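As a quick sanity check on sizes, you can also count tokens directly with tiktoken (the result string here is a hypothetical tool output, e.g. what db.run_no_throw(query) returned):

```python
import tiktoken

encoding = tiktoken.encoding_for_model("gpt-3.5-turbo-16k")
result = "..."  # e.g. the string returned by db.run_no_throw(query)
print(len(encoding.encode(result)))  # tokens this result would add to the prompt
```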
@medecau was this script able to solve the 'murder mystery' challenge on its own? I tried this approach and it does not hit the max token limitation, but it deviates a lot from the original query.
Sorry, we do not have AGI here.
I tried with both gpt-3.5-turbo-16k and gpt-4, and both failed.
- gpt-3.5-turbo-16k: pointed at three witnesses
- gpt-4: went on for a bit having a monologue but did not make any effort to use the database; really, it was embarrassing
I wish these agents were a bit easier to tweak, so we could have a meta-agent for each of the steps of the plan-and-execute agent. As is, these seem almost unapproachable.
PS: ChatGPT (GPT-4 + Code Interpreter) gets very close to resolving the sql-murder-mystery challenge. I had to guide it a bit towards using the right tools, completing the answer, and later to better explore one of the leads. See: https://chat.openai.com/share/52b550ce-90ae-4ded-9364-2e1bdbdfa302
Hi, @jeyarajcs! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
From what I understand, the issue you reported is about encountering a "Token Limit Exceeded Error" when using the create_sql_agent function with Snowflake. The error occurs when the table is larger and the question is more complex, exceeding the model's maximum token limit. Some users have suggested adjusting parameters to limit the table information included in the prompt, using a vector db to pull out relevant columns before building the prompt, or using ChatGPT as a few-shot learner. Additionally, some users have switched to the gpt-3.5-turbo-16k-0613 model to increase the token limit.
Before we proceed, we would like to confirm if this issue is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.
Thank you for your understanding and cooperation!
Has anyone yet found the solution to this issue?
> Is there a method for tracking token usage as the program runs, specifically with a SQL agent?
Hey @foshesss, any updates on that? I am trying to track token usage with tiktoken, but I haven't seen any real use cases with a LangChain agent.
It's a shame this is still a very real issue. Has anyone looked into using MPT-7B for this? A 60k token count should be enough for normal use cases, right? I don't have a computer fast enough to run this model, so I'm not sure how it handles creating SQL queries. Would love to hear from anyone who has accomplished this, though!
@geeman11 in models with big context windows, accuracy drops significantly as the context size increases.