Bad request: The following `model_kwargs` are not used by the model: ['return_full_text', 'stop', 'watermark', 'stop_sequences'] (note: typos in the generate arguments will also show up in this list)
Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a similar question and didn't find it.
- [X] I am sure that this is a bug in LangChain rather than my code.
- [X] The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
Example Code
```python
from langchain.prompts import PromptTemplate
from langchain_community.llms.huggingface_endpoint import HuggingFaceEndpoint
from langchain.chains import LLMChain

llm = HuggingFaceEndpoint(
    repo_id="google/flan-t5-large",
    temperature=0,
    max_new_tokens=250,
    huggingfacehub_api_token=HUGGINGFACE_TOKEN,
)

prompt_tpl = PromptTemplate(
    template="What is the good name for a company that makes {product}",
    input_variables=["product"],
)

chain = LLMChain(llm=llm, prompt=prompt_tpl)
print(chain.invoke("colorful socks"))
```
Error Message and Stack Trace (if applicable)
```
Traceback (most recent call last):
  File "/Users/michaelchu/Documents/agent/agent.py", line 20, in <module>
Bad request:
The following `model_kwargs` are not used by the model: ['return_full_text', 'stop', 'watermark', 'stop_sequences'] (note: typos in the generate arguments will also show up in this list)
```
Description
Hi, folks. I'm just trying to run a simple LLMChain and I'm getting the Bad Request error from the `model_kwargs` check. I found that several of the same issues have been raised before; however, the problem hasn't been fixed in the latest release of LangChain. Please take a look, thanks! Previous issue: https://github.com/langchain-ai/langchain/issues/10848
System Info
System Information
OS: Darwin
OS Version: Darwin Kernel Version 23.2.0: Wed Nov 15 21:53:18 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T6000
Python Version: 3.12.1 (main, Feb 14 2024, 09:50:51) [Clang 15.0.0 (clang-1500.1.0.2.5)]
Package Information
langchain_core: 0.1.27
langchain: 0.1.9
langchain_community: 0.0.24
langsmith: 0.1.10
Packages not installed (Not Necessarily a Problem)
The following packages were not found:
langgraph
langserve
I'm facing the same problem (with various other models as well). I believe it could be caused by this commit to the huggingface_endpoint.py file. There's a new definition of `_default_params` that assumes a base set of params for text-generation tasks. I'm guessing that if the model doesn't accept those params (besides throwing an error for models of different task types), the call fails with the error you received.
I could be wrong, but that's where I'm at with my research on this issue.
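For illustration, a heavily simplified sketch of the injection behavior described above (the parameter names are taken from the error message; the exact defaults and values in the library may differ):

```python
# Hypothetical simplification, not the exact library source: a fixed set of
# text-generation defaults is merged into every request, even for models
# and tasks that do not accept them.
default_params = {
    "stop_sequences": [],
    "return_full_text": False,
    "watermark": False,
    # ... plus sampling params such as temperature, top_k, top_p
}

def build_request_params(user_kwargs: dict) -> dict:
    # Defaults are sent regardless of whether the endpoint supports them.
    return {**default_params, **user_kwargs}
```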
```python
import os
from langchain_community.llms.huggingface_endpoint import HuggingFaceEndpoint

os.environ["HUGGINGFACEHUB_API_TOKEN"] = ""
huggingface_llm = HuggingFaceEndpoint(
    repo_id="google/flan-t5-small",
    temperature=0,
    max_new_tokens=250,
)
huggingface_llm("Hello!")
```
Bad request: The following `model_kwargs` are not used by the model: ['return_full_text', 'watermark', 'stop', 'stop_sequences'] (note: typos in the generate arguments will also show up in this list)
Any help?
I downgraded my requirements on the langchain library for now, and I can use the endpoint class. It's just a workaround, but for reference, I'm using Python 3.11, and I've pinned the following package versions:
```
langchain==0.1.6
langchain-cli==0.0.21
langchain-openai==0.0.6
huggingface_hub==0.21.4
python-dotenv==1.0.0
pydantic==1.10.13
```
```python
from langchain_community.llms.huggingface_endpoint import HuggingFaceEndpoint

hf_model_id = "facebook/bart-large-cnn"
hf_endpoint_url = f"https://api-inference.huggingface.co/models/{hf_model_id}"

llm = HuggingFaceEndpoint(
    task="summarization",
    endpoint_url=hf_endpoint_url,
)
```
I am also facing this issue, and it appears that @nicole-wright is correct. It is specifically these lines: https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/llms/huggingface_endpoint.py#L199-L215 which inject default params into the call to the HuggingFace API regardless of whether they are present on the HuggingFaceEndpoint instance or not.
If a model/endpoint does not support these, it causes HuggingFace to throw back an error (even if the params are populated as None, e.g. {"stop_sequences": None}).
Perhaps a simple fix would be to remove None values from the invocation_params prior to posting to the client.
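As an illustration, a minimal sketch of that suggestion (a hypothetical subclass; the `_invocation_params` signature follows the lines linked above):

```python
from langchain_community.llms.huggingface_endpoint import HuggingFaceEndpoint


class NoneFilteringHuggingFaceEndpoint(HuggingFaceEndpoint):
    """Hypothetical subclass: drop None-valued params before posting."""

    def _invocation_params(self, runtime_stop, **kwargs):
        params = super()._invocation_params(runtime_stop, **kwargs)
        # Strip keys whose value is None so they never reach the Inference API.
        return {key: value for key, value in params.items() if value is not None}
```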
This is also happening when using local models. I'm trying the chain with the local HuggingFace model "flan-t5".
I got errors like:
The following `model_kwargs` are not used by the model: ['return_full_text'] (note: typos in the generate arguments will also show up in this list)
```python
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline

# `flan` holds the flan-t5 model id (defined elsewhere in the original snippet)
llm = HuggingFacePipeline.from_model_id(
    model_id=flan,
    task="text2text-generation",
    model_kwargs={"temperature": 1e-10},  # ,"return_full_text": False
    device=0,
)
template = PromptTemplate(input_variables=["input"], template="{input}")
chain = LLMChain(llm=llm, verbose=True, prompt=template)
chain("Say something?")
```
However, this works when using local HuggingFace gpt2 with "text-generation".
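For comparison, a minimal sketch of the working local text-generation case (gpt2 comes from the comment above; the rest is an assumption using the same HuggingFacePipeline API):

```python
from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline

# The "text-generation" task accepts params such as return_full_text,
# so this local gpt2 variant does not trigger the model_kwargs error.
gpt2_llm = HuggingFacePipeline.from_model_id(
    model_id="gpt2",
    task="text-generation",
)
print(gpt2_llm.invoke("Say something?"))
```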
Did anyone solve this issue?
I'm fairly sure it's a bug and it needs a PR. I'm not sure I know the library well enough to contribute, but I could always give it a shot. For the prototyping I'm doing, downgrading to a previous release works, and I can still access both the hub and the pipeline objects correctly.
I agree. I also see the problem with various models, such as microsoft/phi-1_5. I hope for a quick bug fix.
> I downgraded my requirements on the langchain library for now, and I can use the endpoint class. It's just a workaround, but for reference, I'm using Python 3.11, and I've pinned the following package versions: langchain==0.1.6, langchain-cli==0.0.21, langchain-openai==0.0.6, huggingface_hub==0.21.4, python-dotenv==1.0.0, pydantic==1.10.13 [...]
I am trying to install the langchain-text-splitters library, but it is not compatible with that set of pinned libraries, so I needed to upgrade the langchain library (and the main `model_kwargs` error shows up again). Is there any help? This is urgent, please.
I have an environment like this:
```
huggingface-hub==0.22.2
langchain==0.1.15
langchain-community==0.0.32
langchain-core==0.1.42
langchain-google-genai==1.0.2
langchain-text-splitters==0.0.1
langsmith==0.1.45
```
I have the code below:
```python
import os

from langchain_community.llms.huggingface_endpoint import HuggingFaceEndpoint

# `doc` is a Document loaded elsewhere
pipe = HuggingFaceEndpoint(
    huggingfacehub_api_token=os.getenv("HUGGINGFACE_API_KEY"),
    repo_id="facebook/bart-large-cnn",
)
result = pipe.invoke(doc.page_content)
```
And I am facing the error below:
```
2024-04-12 16:04:22.422 Uncaught app exception
Traceback (most recent call last):
  File "D:\Files\GitHub\RAG-for-pdf-search\venv\Lib\site-packages\huggingface_hub\utils\_errors.py", line 304, in hf_raise_for_status
    response.raise_for_status()
  File "D:\Files\GitHub\RAG-for-pdf-search\venv\Lib\site-packages\requests\models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://api-inference.huggingface.co/models/facebook/bart-large-cnn

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\Files\GitHub\RAG-for-pdf-search\app.py", line 20, in <module>
    result = pipe.invoke(doc.page_content)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Files\GitHub\RAG-for-pdf-search\venv\Lib\site-packages\langchain_core\language_models\llms.py", line 276, in invoke
    self.generate_prompt(
  File "D:\Files\GitHub\RAG-for-pdf-search\venv\Lib\site-packages\langchain_core\language_models\llms.py", line 597, in generate_prompt
    return self.generate(prompt_strings, stop=stop, callbacks=callbacks, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Files\GitHub\RAG-for-pdf-search\venv\Lib\site-packages\langchain_core\language_models\llms.py", line 767, in generate
    output = self._generate_helper(
             ^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Files\GitHub\RAG-for-pdf-search\venv\Lib\site-packages\langchain_core\language_models\llms.py", line 634, in _generate_helper
    raise e
  File "D:\Files\GitHub\RAG-for-pdf-search\venv\Lib\site-packages\langchain_core\language_models\llms.py", line 621, in _generate_helper
    self._generate(
  File "D:\Files\GitHub\RAG-for-pdf-search\venv\Lib\site-packages\langchain_core\language_models\llms.py", line 1231, in _generate
    self._call(prompt, stop=stop, run_manager=run_manager, **kwargs)
  File "D:\Files\GitHub\RAG-for-pdf-search\venv\Lib\site-packages\langchain_community\llms\huggingface_endpoint.py", line 256, in _call
    response = self.client.post(
               ^^^^^^^^^^^^^^^^^
  File "D:\Files\GitHub\RAG-for-pdf-search\venv\Lib\site-packages\huggingface_hub\inference\_client.py", line 267, in post
    hf_raise_for_status(response)
  File "D:\Files\GitHub\RAG-for-pdf-search\venv\Lib\site-packages\huggingface_hub\utils\_errors.py", line 358, in hf_raise_for_status
    raise BadRequestError(message, response=response) from e
huggingface_hub.utils._errors.BadRequestError: (Request ID: 2mvdN9-F1BShz5K5FeYVk)

Bad request:
The following `model_kwargs` are not used by the model: ['stop_sequences', 'stop', 'watermark', 'return_full_text'] (note: typos in the generate arguments will also show up in this list)
```
I tried downgrading the langchain library, but it caused issues with other packages (langchain-community, etc.). What can I do?
Solution: Using HuggingFaceHub with LangChain
I found an effective way to use the HuggingFaceHub model with LangChain. Instead of the previous method, we can simplify and enhance the configuration as follows:
Previous Method
```python
from langchain_huggingface.llms import HuggingFaceEndpoint

llm = HuggingFaceEndpoint(
    repo_id="google/flan-t5-large",
    temperature=0,
    max_new_tokens=250,
    huggingfacehub_api_token=HUGGINGFACE_API_TOKEN,
)
```
Improved Method
This new method utilizes the HuggingFaceHub from LangChain with more detailed model configurations.
```python
from langchain import HuggingFaceHub

llm = HuggingFaceHub(
    repo_id="google/flan-t5-base",
    model_kwargs={
        "temperature": 0,
        "max_length": 180,
        "max_new_tokens": 120,
        "top_k": 10,
        "top_p": 0.95,
        "repetition_penalty": 1.03,
    },
)

# using the model
output = llm.invoke("Can you tell me the capital of Russia")
print(output)
```
@aymeric-roucher, @baskaryan: could you provide a bug fix please? Your commit 0d294760e742e0707a71afc7aad22e4d00b54ae5 breaks LangChain (see the bug report above)!
Hey, I'm also getting the same issue as mentioned above. I tried downgrading the packages with no success. Is there any other workaround?
Same problem here; I reverted back to HuggingFaceHub and am enduring the deprecation warnings. Hopefully the bug gets fixed before the workaround is unsupported.
For those encountering this issue, the problem arises because LangChain tries to send parameters that certain models on the HuggingFace client don't support, specifically: `stop`, `watermark`, `return_full_text`, and `stop_sequences`.
I made a quick wrapper workaround to address this, and I’ll try to submit a PR soon:
```python
import json

from langchain_community.llms.huggingface_endpoint import HuggingFaceEndpoint
from langchain_core.outputs import Generation, GenerationChunk, LLMResult, RunInfo


class MyHuggingFaceEndpoint(HuggingFaceEndpoint):
    def generate_prompt(
        self,
        prompts,
        stop=None,
        callbacks=None,
        **kwargs,
    ):
        prompt_strings = [p.to_string() for p in prompts]
        return self.generate(prompt_strings, stop=stop, callbacks=callbacks, **kwargs)

    def _generate_helper(
        self,
        prompts,
        stop,
        run_managers,
        new_arg_supported,
        **kwargs,
    ):
        try:
            output = (
                self._generate(
                    prompts,
                    stop=stop,
                    # TODO: support multiple run managers
                    run_manager=run_managers[0] if run_managers else None,
                    **kwargs,
                )
                if new_arg_supported
                else self._generate(prompts, stop=stop)
            )
        except BaseException as e:
            for run_manager in run_managers:
                run_manager.on_llm_error(e, response=LLMResult(generations=[]))
            raise e
        flattened_outputs = output.flatten()
        for manager, flattened_output in zip(run_managers, flattened_outputs):
            manager.on_llm_end(flattened_output)
        if run_managers:
            output.run = [
                RunInfo(run_id=run_manager.run_id) for run_manager in run_managers
            ]
        return output

    def _call(
        self,
        prompt: str,
        stop=None,
        run_manager=None,
        **kwargs,
    ) -> str:
        """Call out to HuggingFace Hub's inference endpoint."""
        invocation_params = self._invocation_params(stop, **kwargs)
        if self.streaming:
            completion = ""
            for chunk in self._stream(prompt, stop, run_manager, **invocation_params):
                completion += chunk.text
            return completion
        else:
            # Port 'stop_sequences' into the 'stop' argument.
            invocation_params["stop"] = invocation_params["stop_sequences"]
            response = self.client.post(
                json={"inputs": prompt, "parameters": invocation_params},
                stream=False,
                task=self.task,
            )
            try:
                response_text = json.loads(response.decode())[0]["generated_text"]
            except KeyError:
                response_text = json.loads(response.decode())["generated_text"]
            # Maybe the generation has stopped at one of the stop sequences:
            # then we remove this stop sequence from the end of the generated text.
            if invocation_params["stop_sequences"]:
                for stop_seq in invocation_params["stop_sequences"]:
                    if response_text[-len(stop_seq):] == stop_seq:
                        response_text = response_text[: -len(stop_seq)]
            return response_text

    def _invocation_params(self, runtime_stop, **kwargs):
        params = {**self._default_params, **kwargs}
        # Only concatenate when stop_sequences is actually a list, so that
        # passing stop_sequences=None at invoke time does not raise a TypeError.
        if isinstance(params["stop_sequences"], list):
            params["stop_sequences"] = params["stop_sequences"] + (runtime_stop or [])
        return params
```
And here is how you call it:
```python
# MyHuggingFaceEndpoint is the wrapper class defined above.

# initialize Hub LLM
llm = MyHuggingFaceEndpoint(
    max_new_tokens=250,
    repo_id="google/flan-t5-large",
    temperature=0,
    repetition_penalty=1.03,
    huggingfacehub_api_token="<YOUR HF KEY>",
)

print(
    llm.invoke(
        "What is Deep Learning?",
        stop=None,
        watermark=None,
        return_full_text=None,
        stop_sequences=None,
    )
)
```
This should work. The reason for using the wrapper is that you can't simply set the parameters to `None` in the `invoke` call. Doing so causes an unsupported-operand `TypeError`, because `stop_sequences` is concatenated with a list at some point.
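A minimal reproduction of that concatenation failure, assuming the stock `_invocation_params` lacks the `isinstance` guard shown in the wrapper above:

```python
# Hypothetical illustration: what the stock implementation roughly does
# when the caller passes stop_sequences=None.
params = {"stop_sequences": None}
runtime_stop = None
params["stop_sequences"] = params["stop_sequences"] + (runtime_stop or [])
# TypeError: unsupported operand type(s) for +: 'NoneType' and 'list'
```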
> Solution: Using HuggingFaceHub with LangChain
> I found an effective way to use the HuggingFaceHub model with LangChain. Instead of the previous method, we can simplify and enhance the configuration as follows: [...]

This works for me.
Hi, @michaelCHU95. I'm Dosu, and I'm helping the LangChain team manage their backlog. I'm marking this issue as stale.
Issue Summary:
- You reported a bug where certain `model_kwargs` are not utilized, causing a "Bad request" error.
- Users identified a recent commit as a potential cause, with default parameters being incorrectly injected.
- Temporary resolutions include downgrading LangChain or using a custom wrapper.
- An alternative method using `HuggingFaceHub` has been shared and worked for some users.
Next Steps:
- Please confirm if this issue is still relevant with the latest version of LangChain. If so, you can keep the discussion open by commenting here.
- If there is no further activity, this issue will be automatically closed in 7 days.
Thank you for your understanding and contribution!
This issue is still there.