Issue: Token Limit Exceeded Error in SQL Database Agent
Issue you'd like to raise.
I am trying to work with Snowflake using create_sql_agent, and I am very often getting a token limit error.
This is my code:
```python
from langchain.agents import create_sql_agent, AgentExecutor
from langchain.agents.agent_toolkits import SQLDatabaseToolkit
from langchain.sql_database import SQLDatabase
from langchain.llms.openai import OpenAI
from sqlalchemy.dialects import registry

# Register the Snowflake SQLAlchemy dialect
registry.load("snowflake")

account_identifier = 'xxxx'
user = 'xxxx'
password = 'xxxx'
database_name = 'xxxx'
schema_name = 'xxxx'
warehouse_name = 'xxxx'
role_name = 'xxxx'

conn_string = f"snowflake://{user}:{password}@{account_identifier}/{database_name}/{schema_name}?warehouse={warehouse_name}&role={role_name}"
db = SQLDatabase.from_uri(conn_string)
print("DB===", db)

toolkit = SQLDatabaseToolkit(llm=OpenAI(temperature=0), db=db)
agent_executor = create_sql_agent(
    llm=OpenAI(temperature=0),
    toolkit=toolkit,
    verbose=True,
)
agent_executor.run("Which companies are getting the most reviews in a specific category?")
```
If I ask a straightforward question about a tiny table that has only 5 records, the agent runs well.
If the table is slightly bigger and the question is complex, it throws: InvalidRequestError: This model's maximum context length is 4097 tokens, however you requested 13719 tokens (13463 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.
I think the issue is from SQLDatabaseToolkit. The schema_sql_db tool takes a comma-separated list of tables as input, and its output is the schema and sample rows for those tables. If the query is a bit lengthy or complex, the tool returns pretty much every table's schema along with sample rows. I think that's where we hit the token limit error.
But I don't know how to deal with this issue.
I am also facing the same issue.
@hwchase17 Any thoughts? Any workaround to pass this?
There are a couple of parameters you can adjust to limit the table information that's going to be included in the prompt.
```python
SQLDatabase.from_uri(
    conn_string,
    sample_rows_in_table_info=1,  # default 3
    include_tables=["my_table_1"],  # will restrict to just using my_table_1
)
```
You can also use ignore_tables to limit the tables included in the table info.
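For instance, a minimal sketch of the ignore_tables variant (the table names here are hypothetical):

```python
# Exclude known-large tables from the table info instead of whitelisting.
# Note: include_tables and ignore_tables are mutually exclusive.
db = SQLDatabase.from_uri(
    conn_string,
    sample_rows_in_table_info=1,
    ignore_tables=["audit_log", "event_history"],  # hypothetical large tables
)
```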
I am facing the same issue, even with only two tables included:

```python
sample_rows_in_table_info=1,  # default 3
include_tables=['tblSales', 'tblCustomer']
```

The customer table contains 125 columns and the sales table contains 102 columns.
Question: "List the top 10 customers based on total sales this year." This links both tables, so the fields of both tables are sent to OpenAI.
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens, however you requested 7603 tokens (7347 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.
I have the same issue. I have tables that have 100+ columns and the prompts being generated are just too large.
I might be crazy, but I am thinking about putting a vector db in front of this and then having the query first pull out the columns it thinks are relevant before building up the prompt.
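A rough sketch of that idea, assuming a FAISS vector store and one short description per column (the table/column names below are made up):

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS  # requires `pip install faiss-cpu`

# One short description per column, for every table the agent should know about.
column_docs = [
    "tblSales.total_amount: total sale value in USD",
    "tblSales.sale_date: date the sale was made",
    "tblCustomer.customer_name: display name of the customer",
    # ... one entry per column
]

store = FAISS.from_texts(column_docs, OpenAIEmbeddings())

question = "List the top 10 customers based on total sales this year."
relevant = store.similarity_search(question, k=5)
for doc in relevant:
    print(doc.page_content)  # include only these columns when building the prompt
```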
The agent has to query the list of tables in the database, then the schema for the most relevant tables. If you have lots of relevant-sounding tables, or big schemas for some tables, you hit the limit of the prompt length.
Immediate options are:
- decrease the number of columns on your tables
- try to get access to GPT-4 with an 8k context instead of 4k
- create a view onto your table with just a few columns, then adapt the LangChain code if necessary to make the agent consider the view rather than the table
- give up on the agent and use ChatGPT as a few-shot learner: give it examples of SQL queries that do work on your tables for various natural-language questions, then see whether it does a better job of building new SQL (see the sketch after this list)
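A sketch of that last option: prompt the chat model directly with natural-language/SQL example pairs instead of letting the agent discover the schema (the example queries and table names are hypothetical):

```python
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage

# Hypothetical examples; use queries you know work against your own tables.
examples = """
Q: How many reviews were posted last month?
SQL: SELECT COUNT(*) FROM reviews WHERE created_at >= DATEADD(month, -1, CURRENT_DATE);

Q: Which category has the most products?
SQL: SELECT category, COUNT(*) AS n FROM products GROUP BY category ORDER BY n DESC LIMIT 1;
"""

llm = ChatOpenAI(temperature=0)
messages = [
    SystemMessage(content="Translate questions into Snowflake SQL.\nExamples:\n" + examples),
    HumanMessage(content="Which companies are getting the most reviews in a specific category?"),
]
print(llm(messages).content)  # the generated SQL, to run yourself
```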
Hi Will! Is there a method for tracking token usage as the program runs, specifically with a SQL agent? It's very difficult to identify the source of large token usage. I feel like if there was a way to monitor this, it would give me better insights when it comes to handling this problem.
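One way to monitor this is LangChain's get_openai_callback context manager, which accumulates token counts across every OpenAI call the agent makes. A minimal sketch, assuming the agent_executor defined earlier in the thread:

```python
from langchain.callbacks import get_openai_callback

with get_openai_callback() as cb:
    agent_executor.run("Which companies are getting the most reviews in a specific category?")

print(f"Prompt tokens:     {cb.prompt_tokens}")
print(f"Completion tokens: {cb.completion_tokens}")
print(f"Total tokens:      {cb.total_tokens}")
```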
Switching to the gpt-3.5-turbo-16k-0613 model and then setting agent_type=AgentType.OPENAI_FUNCTIONS in the create_sql_agent method increases the token limit to 16k and gets better responses for me. Still not ideal, but it does help.
You need to include the following import for the AgentType enum:

```python
from langchain.agents.agent_types import AgentType
```
Also note that AgentType.OPENAI_FUNCTIONS only worked well for me with the 0613 models. This behaviour is related to this documentation: https://python.langchain.com/docs/modules/agents/agent_types/openai_functions_agent
I get a lot of parsing errors if I use that AgentType setting with the standard gpt-3.5-turbo model.
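Putting that advice together, a minimal sketch (conn_string as defined earlier in the thread; the model name and agent type are as suggested above):

```python
from langchain.agents import create_sql_agent
from langchain.agents.agent_toolkits import SQLDatabaseToolkit
from langchain.agents.agent_types import AgentType
from langchain.chat_models import ChatOpenAI
from langchain.sql_database import SQLDatabase

llm = ChatOpenAI(model="gpt-3.5-turbo-16k-0613", temperature=0)
db = SQLDatabase.from_uri(conn_string)  # conn_string as earlier in the thread
toolkit = SQLDatabaseToolkit(db=db, llm=llm)

agent_executor = create_sql_agent(
    llm=llm,
    toolkit=toolkit,
    agent_type=AgentType.OPENAI_FUNCTIONS,  # function-calling agent, for 0613 models
    verbose=True,
)
```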
Am I missing something, or do these limits make this functionally useless for most real-world cases? Unless you're looking at a very narrow focus for questions or have a tiny database schema, you're going to run into this limit. I've been trying to get this working with the Microsoft sample database AdventureWorks, which is not particularly big in terms of schema (certainly way smaller than any of our product DB schemas), and even that is far too large for the 4k limit on the model I'm using. Switching to a model with a 16k limit would work in this particular example, but it's still going to be too small for any reasonably sized schema. Unless you've got a narrow focus for questions, you can't limit tables/columns because you don't know what's going to be asked. And if you do know what's going to be asked, why even bother with AI?
I'm not trying to be antagonistic with this question; I'm genuinely curious what the application of this tool is with these limitations.
I'm trying to use the SQL agent at $WORK, and like others we're hitting the token limits. We're already using gpt-3.5-turbo-16k, but even that is not enough to handle the 34k tokens sometimes necessary for the results from SQL queries.
We're experimenting with the ability of the Plan & Execute agent to complete tasks, using the SQL agent as a toolkit.get_tools() provider to the P&E agent.
The task we're trying to accomplish is getting the agent to complete the 'SQL Murder Mystery' challenge.
The agent is currently not able to complete the challenge.
I got it working for my use case; maybe the snippet includes enough hints for others to pick up from.
```python
from typing import List, Optional

from langchain.agents.agent_toolkits import SQLDatabaseToolkit
from langchain.callbacks.manager import CallbackManagerForToolRun
from langchain.chat_models import ChatOpenAI
from langchain.experimental.plan_and_execute import (
    PlanAndExecute,
    load_agent_executor,
    load_chat_planner,
)
from langchain.sql_database import SQLDatabase
from langchain.text_splitter import TokenTextSplitter
from langchain.tools import BaseTool
from langchain.tools.sql_database.tool import QuerySQLDataBaseTool

model = ChatOpenAI(model="gpt-3.5-turbo-16k", temperature=0)
token_limit = 16_000  # the model's context size, for reference
db = SQLDatabase.from_uri("sqlite:///my_database.db")


def split_text(text, chunk_size, chunk_overlap=0):
    text_splitter = TokenTextSplitter(
        chunk_size=chunk_size, chunk_overlap=chunk_overlap
    )
    yield from text_splitter.split_text(text)


class QuerySQLDatabaseTool2(QuerySQLDataBaseTool):
    def _run(
        self,
        query: str,
        run_manager: Optional[CallbackManagerForToolRun] = None,
    ) -> str:
        # Truncate the query result to the first 14k-token chunk so the
        # prompt stays within the model's context window.
        result = self.db.run_no_throw(query)
        return next(split_text(result, chunk_size=14_000))


class SQLDatabaseToolkit2(SQLDatabaseToolkit):
    def get_tools(self) -> List[BaseTool]:
        # Swap the stock query tool for the truncating version above,
        # keeping the original description so the agent uses it the same way.
        tools = super().get_tools()
        original_query_tool_description = tools[0].description
        new_query_tool = QuerySQLDatabaseTool2(
            db=self.db, description=original_query_tool_description
        )
        tools[0] = new_query_tool
        return tools


toolkit = SQLDatabaseToolkit2(db=db, llm=model)
planner = load_chat_planner(model)
executor = load_agent_executor(model, toolkit.get_tools(), verbose=True)
agent = PlanAndExecute(planner=planner, executor=executor, verbose=True)
agent.run(my_request)  # my_request: your natural-language task
```

You'll need tiktoken: `pip install tiktoken` or `poetry add tiktoken`.
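As a quick sanity check on sizes, you can also count tokens directly with tiktoken (the result string here is a hypothetical tool output, e.g. what db.run_no_throw(query) returned):

```python
import tiktoken

encoding = tiktoken.encoding_for_model("gpt-3.5-turbo-16k")
result = "..."  # e.g. the string returned by db.run_no_throw(query)
print(len(encoding.encode(result)))  # tokens this result would add to the prompt
```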
@medecau was this script able to solve the 'murder mystery' challenge on its own? I tried this approach and it does not hit the max token limitation, but it deviates a lot from the original query.
Sorry, we do not have AGI here.
I tried with both gpt-3.5-turbo-16k and gpt-4, and both failed.
- gpt-3.5-turbo-16k: pointed at three witnesses
- gpt-4: went on for a bit having a monologue but did not make any effort to use the database; really, it was embarrassing
I wish these agents were a bit easier to tweak, so we could have a meta-agent for each of the steps of the plan-and-execute agent. As is, these seem almost unapproachable.
PS: ChatGPT (GPT-4 + Code Interpreter) gets very close to resolving the sql-murder-mystery challenge. I had to guide it a bit towards using the right tools, completing the answer, and later to better explore one of the leads. See: https://chat.openai.com/share/52b550ce-90ae-4ded-9364-2e1bdbdfa302
Hi, @jeyarajcs! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
From what I understand, the issue you reported is about encountering a "Token Limit Exceeded Error" when using the create_sql_agent function with Snowflake. The error occurs when the table is larger and the question is more complex, exceeding the model's maximum token limit. Some users have suggested adjusting parameters to limit the table information included in the prompt, using a vector db to pull out relevant columns before building the prompt, or using ChatGPT as a few-shot learner. Additionally, some users have switched to the gpt-3.5-turbo-16k-0613 model to increase the token limit.
Before we proceed, we would like to confirm if this issue is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.
Thank you for your understanding and cooperation!
Has anyone yet found the solution to this issue?
> Is there a method for tracking token usage as the program runs, specifically with a SQL agent?
Hey @foshesss, any updates on that? I am trying to track token usage with tiktoken, but I haven't seen any real use cases with a LangChain agent.
It's a shame this is still a very real issue. Has anyone looked into using MPT-7B for this? A 60k token count should be enough for normal use cases, right? I don't have a computer fast enough to run this model, so I'm not sure how it handles creating SQL queries. Would love to hear from anyone who has accomplished this, though!
@geeman11 in models with big context windows, accuracy drops significantly as the context size increases.