
Bad request: The following `model_kwargs` are not used by the model: ['return_full_text', 'stop', 'watermark', 'stop_sequences'] (note: typos in the generate arguments will also show up in this list)

Open michaelCHU95 opened this issue 1 year ago • 17 comments

Checked other resources

  • [X] I added a very descriptive title to this issue.
  • [X] I searched the LangChain documentation with the integrated search.
  • [X] I used the GitHub search to find a similar question and didn't find it.
  • [X] I am sure that this is a bug in LangChain rather than my code.
  • [X] The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

from langchain.prompts import PromptTemplate
from langchain_community.llms.huggingface_endpoint import HuggingFaceEndpoint
from langchain.chains import LLMChain

llm = HuggingFaceEndpoint(
    repo_id="google/flan-t5-large",
    temperature=0,
    max_new_tokens=250,
    huggingfacehub_api_token=HUGGINGFACE_TOKEN,
)

prompt_tpl = PromptTemplate(
    template="What is a good name for a company that makes {product}?",
    input_variables=["product"],
)

chain = LLMChain(llm=llm, prompt=prompt_tpl)
print(chain.invoke("colorful socks"))

Error Message and Stack Trace (if applicable)

Traceback (most recent call last):
  File "/Users/michaelchu/Documents/agent/agent.py", line 20, in <module>
    print(chain.invoke("colorful socks"))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/michaelchu/Documents/agent/venv/lib/python3.12/site-packages/langchain/chains/base.py", line 163, in invoke
    raise e
  File "/Users/michaelchu/Documents/agent/venv/lib/python3.12/site-packages/langchain/chains/base.py", line 153, in invoke
    self._call(inputs, run_manager=run_manager)
  File "/Users/michaelchu/Documents/agent/venv/lib/python3.12/site-packages/langchain/chains/llm.py", line 103, in _call
    response = self.generate([inputs], run_manager=run_manager)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/michaelchu/Documents/agent/venv/lib/python3.12/site-packages/langchain/chains/llm.py", line 115, in generate
    return self.llm.generate_prompt(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/michaelchu/Documents/agent/venv/lib/python3.12/site-packages/langchain_core/language_models/llms.py", line 568, in generate_prompt
    return self.generate(prompt_strings, stop=stop, callbacks=callbacks, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/michaelchu/Documents/agent/venv/lib/python3.12/site-packages/langchain_core/language_models/llms.py", line 741, in generate
    output = self._generate_helper(
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/michaelchu/Documents/agent/venv/lib/python3.12/site-packages/langchain_core/language_models/llms.py", line 605, in _generate_helper
    raise e
  File "/Users/michaelchu/Documents/agent/venv/lib/python3.12/site-packages/langchain_core/language_models/llms.py", line 592, in _generate_helper
    self._generate(
  File "/Users/michaelchu/Documents/agent/venv/lib/python3.12/site-packages/langchain_core/language_models/llms.py", line 1177, in _generate
    self._call(prompt, stop=stop, run_manager=run_manager, **kwargs)
  File "/Users/michaelchu/Documents/agent/venv/lib/python3.12/site-packages/langchain_community/llms/huggingface_endpoint.py", line 256, in _call
    response = self.client.post(
               ^^^^^^^^^^^^^^^^^
  File "/Users/michaelchu/Documents/agent/venv/lib/python3.12/site-packages/huggingface_hub/inference/_client.py", line 242, in post
    hf_raise_for_status(response)
  File "/Users/michaelchu/Documents/agent/venv/lib/python3.12/site-packages/huggingface_hub/utils/_errors.py", line 358, in hf_raise_for_status
    raise BadRequestError(message, response=response) from e
huggingface_hub.utils._errors.BadRequestError: (Request ID: AxsbrX3A4JxXuBdYC7fv-)

Bad request: The following model_kwargs are not used by the model: ['return_full_text', 'stop', 'watermark', 'stop_sequences'] (note: typos in the generate arguments will also show up in this list)

Description

Hi, folks. I'm just trying to run a simple LLMChain and I'm getting a Bad Request error from the model_kwargs check. I found that several issues have been raised about the same problem, but it still isn't fixed in the latest release of LangChain. Please take a look, thanks! Previously raised issue: https://github.com/langchain-ai/langchain/issues/10848

System Info

System Information

OS: Darwin
OS Version: Darwin Kernel Version 23.2.0: Wed Nov 15 21:53:18 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T6000
Python Version: 3.12.1 (main, Feb 14 2024, 09:50:51) [Clang 15.0.0 (clang-1500.1.0.2.5)]

Package Information

langchain_core: 0.1.27
langchain: 0.1.9
langchain_community: 0.0.24
langsmith: 0.1.10

Packages not installed (Not Necessarily a Problem)

The following packages were not found:

langgraph
langserve

michaelCHU95 avatar Feb 29 '24 13:02 michaelCHU95

I'm facing the same problem (with various other models as well). I believe it could be caused by this commit to the huggingface_endpoint.py file. There's a new definition of _default_params that assumes a base set of params for text-generation tasks. My guess is that if the model doesn't accept those params, it fails with the error you received (besides throwing an error for models of other task types). I could be wrong, but that's where I'm at with my research on this issue.
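To make that concrete, here is a small standalone paraphrase of the behavior (the class and values are hypothetical stand-ins; the real _default_params in huggingface_endpoint.py has more fields):

from typing import Any, Dict

class EndpointSketch:
    # Hypothetical stand-ins for HuggingFaceEndpoint's attributes.
    temperature = 0.0
    return_full_text = None
    watermark = None
    stop_sequences: list = []

    @property
    def _default_params(self) -> Dict[str, Any]:
        # Paraphrased: keys like return_full_text, watermark, and stop_sequences
        # are always included, even when their values are None or empty.
        return {
            "temperature": self.temperature,
            "return_full_text": self.return_full_text,
            "watermark": self.watermark,
            "stop_sequences": self.stop_sequences,
        }

print(EndpointSketch()._default_params)
# {'temperature': 0.0, 'return_full_text': None, 'watermark': None, 'stop_sequences': []}

Since those keys are sent in every request, any model that doesn't accept them gets the Bad Request above.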

nicole-wright avatar Mar 06 '24 17:03 nicole-wright

import os
from langchain_community.llms.huggingface_endpoint import HuggingFaceEndpoint

os.environ["HUGGINGFACEHUB_API_TOKEN"] = ""

huggingface_llm = HuggingFaceEndpoint(
    repo_id="google/flan-t5-small",
    temperature=0,
    max_new_tokens=250,
)
huggingface_llm("Hello!")

Bad request: The following model_kwargs are not used by the model: ['return_full_text', 'watermark', 'stop', 'stop_sequences'] (note: typos in the generate arguments will also show up in this list)

Any help?

joelorellana avatar Mar 07 '24 14:03 joelorellana

I downgraded my requirements on the langchain library for now, and I can use the endpoint class. It's just a workaround, but for reference, I'm using Python 3.11, and I've pinned the following package versions:

langchain==0.1.6
langchain-cli==0.0.21
langchain-openai==0.0.6
huggingface_hub==0.21.4
python-dotenv==1.0.0
pydantic==1.10.13

hf_model_id = "facebook/bart-large-cnn"
hf_endpoint_url = f"https://api-inference.huggingface.co/models/{hf_model_id}"

llm = HuggingFaceEndpoint(
    task="summarization",
    endpoint_url=hf_endpoint_url,
)

nicole-wright avatar Mar 07 '24 14:03 nicole-wright

I am also facing this issue, and it appears that @nicole-wright is correct. It is specifically these lines: https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/llms/huggingface_endpoint.py#L199-L215 which inject default params into the call to the HuggingFace API regardless of whether they are present on the HuggingFaceEndpoint instance or not.

If a model/endpoint does not support these, it causes HuggingFace to throw back an error (even if the params are populated as None, e.g. {"stop_sequences": None}).

Perhaps a simple fix would be to remove None values from the invocation_params prior to posting to the client.
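A minimal sketch of that idea (the helper name and the filtering line are hypothetical; the real builder is the _invocation_params method in huggingface_endpoint.py):

def build_invocation_params(default_params, runtime_stop=None, **kwargs):
    # Mirrors how the endpoint merges its defaults with call-time kwargs.
    params = {**default_params, **kwargs}
    params["stop_sequences"] = (params.get("stop_sequences") or []) + (runtime_stop or [])
    # Proposed change: strip None values before posting to the client.
    return {k: v for k, v in params.items() if v is not None}

print(build_invocation_params({"watermark": None, "return_full_text": None, "temperature": 0.7}))
# {'temperature': 0.7, 'stop_sequences': []}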

jamesnixon-aws avatar Mar 07 '24 14:03 jamesnixon-aws

This also happens when using local models. I'm trying the chain with the local HuggingFace model "flan-t5".

I got errors of the type: The following model_kwargs are not used by the model: ['return_full_text'] (note: typos in the generate arguments will also show up in this list)

llm = HuggingFacePipeline.from_model_id(
    model_id=flan,
    task="text2text-generation",
    model_kwargs={
        "temperature": 1e-10,
        # "return_full_text": False,
    },
    device=0,
)
template = PromptTemplate(input_variables=["input"], template="{input}")
chain = LLMChain(llm=llm, verbose=True, prompt=template)
chain("Say something?")

However, this is working when using local Huggingface gpt2 with "text-generation"
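For comparison, a minimal sketch of the working text-generation case (the model id and kwargs are illustrative, not the exact code run):

from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain_community.llms import HuggingFacePipeline

# Same chain as above, but with a text-generation model: no model_kwargs error.
llm = HuggingFacePipeline.from_model_id(
    model_id="gpt2",
    task="text-generation",
    model_kwargs={"temperature": 1e-10},
    device=0,
)
template = PromptTemplate(input_variables=["input"], template="{input}")
chain = LLMChain(llm=llm, verbose=True, prompt=template)
chain("Say something?")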

Laatcha avatar Mar 11 '24 15:03 Laatcha

Did anyone solve this issue?

dongkyunlim77 avatar Mar 22 '24 07:03 dongkyunlim77

I'm fairly sure it's a bug and it needs a PR. I'm not sure I know the library well enough to contribute, but I could always give it a shot. For the prototyping I'm doing, downgrading to a previous release works, and I can still access both the hub and the pipeline objects correctly.

nicole-wright avatar Mar 22 '24 17:03 nicole-wright

I agree. I'm also seeing the problem with various models, such as microsoft/phi-1_5. I hope for a fast bug fix.

AccentureGabriv93 avatar Apr 08 '24 13:04 AccentureGabriv93

> I downgraded my requirements on the langchain library for now, and I can use the endpoint class. [nicole-wright's workaround, with the same pinned versions and HuggingFaceEndpoint snippet, quoted in full above]

I am trying to install the langchain-text-splitters library, but it isn't compatible with this set of pinned versions, so I had to upgrade the langchain library (and the main model_kwargs error shows up again). Is there any way around this? This is urgent, please.

saad-shahrour avatar Apr 11 '24 15:04 saad-shahrour

I have environment like

huggingface-hub==0.22.2
langchain==0.1.15
langchain-community==0.0.32
langchain-core==0.1.42
langchain-google-genai==1.0.2
langchain-text-splitters==0.0.1
langsmith==0.1.45

I have code as below:

import os

from langchain_community.llms.huggingface_endpoint import HuggingFaceEndpoint

pipe = HuggingFaceEndpoint(
        huggingfacehub_api_token=os.getenv("HUGGINGFACE_API_KEY"),
        repo_id="facebook/bart-large-cnn",
)

result = pipe.invoke(doc.page_content)

And I am facing error as below:

2024-04-12 16:04:22.422 Uncaught app exception
Traceback (most recent call last):
  File "D:\Files\GitHub\RAG-for-pdf-search\venv\Lib\site-packages\huggingface_hub\utils\_errors.py", line 304, in hf_raise_for_status
    response.raise_for_status()
  File "D:\Files\GitHub\RAG-for-pdf-search\venv\Lib\site-packages\requests\models.py", line 1021, in raise_for_status        
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://api-inference.huggingface.co/models/facebook/bart-large-cnn

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\Files\GitHub\RAG-for-pdf-search\app.py", line 20, in <module>
    result = pipe.invoke(doc.page_content)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Files\GitHub\RAG-for-pdf-search\venv\Lib\site-packages\langchain_core\language_models\llms.py", line 276, in invoke
    self.generate_prompt(
  File "D:\Files\GitHub\RAG-for-pdf-search\venv\Lib\site-packages\langchain_core\language_models\llms.py", line 597, in generate_prompt
    return self.generate(prompt_strings, stop=stop, callbacks=callbacks, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Files\GitHub\RAG-for-pdf-search\venv\Lib\site-packages\langchain_core\language_models\llms.py", line 767, in generate
    output = self._generate_helper(
             ^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Files\GitHub\RAG-for-pdf-search\venv\Lib\site-packages\langchain_core\language_models\llms.py", line 634, in _generate_helper
    raise e
  File "D:\Files\GitHub\RAG-for-pdf-search\venv\Lib\site-packages\langchain_core\language_models\llms.py", line 621, in _generate_helper
    self._generate(
  File "D:\Files\GitHub\RAG-for-pdf-search\venv\Lib\site-packages\langchain_core\language_models\llms.py", line 1231, in _generate
    self._call(prompt, stop=stop, run_manager=run_manager, **kwargs)
  File "D:\Files\GitHub\RAG-for-pdf-search\venv\Lib\site-packages\langchain_community\llms\huggingface_endpoint.py", line 256, in _call
    response = self.client.post(
               ^^^^^^^^^^^^^^^^^
  File "D:\Files\GitHub\RAG-for-pdf-search\venv\Lib\site-packages\huggingface_hub\inference\_client.py", line 267, in post   
    hf_raise_for_status(response)
  File "D:\Files\GitHub\RAG-for-pdf-search\venv\Lib\site-packages\huggingface_hub\utils\_errors.py", line 358, in hf_raise_for_status
    raise BadRequestError(message, response=response) from e
huggingface_hub.utils._errors.BadRequestError:  (Request ID: 2mvdN9-F1BShz5K5FeYVk)

Bad request:
The following `model_kwargs` are not used by the model: ['stop_sequences', 'stop', 'watermark', 'return_full_text'] (note: typos in the generate arguments will also show up in this list)

I tried downgrading the langchain library, but it causes issues with other packages (langchain-community, etc.). What can I do?

NotShrirang avatar Apr 12 '24 10:04 NotShrirang

Solution: Using HuggingFaceHub with LangChain

I found an effective way to use the HuggingFaceHub model with LangChain. Instead of the previous method, we can simplify and enhance the configuration as follows:

Previous Method

from langchain_huggingface.llms import HuggingFaceEndpoint 

llm = HuggingFaceEndpoint(
    repo_id="google/flan-t5-large",
    temperature=0,
    max_new_tokens=250,
    huggingfacehub_api_token=HUGGINGFACE_API_TOKEN
)

Improved Method

This method uses HuggingFaceHub from LangChain with more detailed model configuration.

from langchain import HuggingFaceHub

llm = HuggingFaceHub(
    repo_id="google/flan-t5-base",
    model_kwargs={
        "temperature": 0,
        "max_length": 180,
        "max_new_tokens": 120,
        "top_k": 10,
        "top_p": 0.95,
        "repetition_penalty": 1.03,
    },
)

# using the model 
output = llm.invoke('Can you tell me the capital of russia')

print(output)

anujsahani01 avatar Jun 08 '24 11:06 anujsahani01

@aymeric-roucher, @baskaryan - could you provide a bug fix, please? Your commit 0d294760e742e0707a71afc7aad22e4d00b54ae5 breaks LangChain (see the bug report above)!

olk avatar Jun 12 '24 04:06 olk

Hey, I'm also getting the same issue as mentioned above. I tried downgrading the packages with no success. Is there any other workaround?

dordonezc avatar Jun 16 '24 23:06 dordonezc

Same problem here; I reverted to HuggingFaceHub and am enduring the deprecation warnings. Hopefully the bug gets fixed before the workaround becomes unsupported.

jodyhuntatx avatar Jul 16 '24 13:07 jodyhuntatx

For those encountering this issue, the problem arises because LangChain tries to send parameters that certain models behind the Hugging Face client don't support, specifically: stop, watermark, return_full_text, and stop_sequences.

I made a quick wrapper as a workaround, and I'll try to submit a PR soon:

import json

from langchain_core.outputs import LLMResult, RunInfo
from langchain_huggingface.llms.huggingface_endpoint import HuggingFaceEndpoint

class MyHuggingFaceEndpoint(HuggingFaceEndpoint):
    def generate_prompt(
        self,
        prompts,
        stop = None,
        callbacks = None,
        **kwargs,
    ):
        prompt_strings = [p.to_string() for p in prompts]
        return self.generate(prompt_strings, stop=stop, callbacks=callbacks, **kwargs)

    def _generate_helper(
        self,
        prompts,
        stop,
        run_managers,
        new_arg_supported,
        **kwargs,
    ):
        try:
            output = (
                self._generate(
                    prompts,
                    stop=stop,
                    # TODO: support multiple run managers
                    run_manager=run_managers[0] if run_managers else None,
                    **kwargs,
                )
                if new_arg_supported
                else self._generate(prompts, stop=stop)
            )
        except BaseException as e:
            for run_manager in run_managers:
                run_manager.on_llm_error(e, response=LLMResult(generations=[]))
            raise e
        flattened_outputs = output.flatten()
        for manager, flattened_output in zip(run_managers, flattened_outputs):
            manager.on_llm_end(flattened_output)
        if run_managers:
            output.run = [
                RunInfo(run_id=run_manager.run_id) for run_manager in run_managers
            ]
        return output

    def _call(
        self,
        prompt: str,
        stop = None,
        run_manager= None,
        **kwargs,
    ) -> str:
        """Call out to HuggingFace Hub's inference endpoint."""
        invocation_params = self._invocation_params(stop, **kwargs)
        if self.streaming:
            completion = ""
            for chunk in self._stream(prompt, stop, run_manager, **invocation_params):
                completion += chunk.text
            return completion
        else:
            invocation_params["stop"] = invocation_params[
                "stop_sequences"
            ]  # porting 'stop_sequences' into the 'stop' argument
            response = self.client.post(
                json={"inputs": prompt, "parameters": invocation_params},
                stream=False,
                task=self.task,
            )
            try:
                response_text = json.loads(response.decode())[0]["generated_text"]
            except KeyError:
                response_text = json.loads(response.decode())["generated_text"]

            # Maybe the generation has stopped at one of the stop sequences:
            # then we remove this stop sequence from the end of the generated text
            if invocation_params["stop_sequences"]:
                for stop_seq in invocation_params["stop_sequences"]:
                    if response_text[-len(stop_seq) :] == stop_seq:
                        response_text = response_text[: -len(stop_seq)]
            return response_text

    def _invocation_params(self, runtime_stop, **kwargs):
        params = {**self._default_params, **kwargs}
        if isinstance(params["stop_sequences"], list):
            params["stop_sequences"] = params["stop_sequences"] + (runtime_stop or [])
        return params

and how you call it:

# MyHuggingFaceEndpoint is the wrapper class defined above.

# initialize the Hub LLM
llm = MyHuggingFaceEndpoint(
    max_new_tokens=250,
    repo_id='google/flan-t5-large',
    temperature=0,
    repetition_penalty=1.03,
    huggingfacehub_api_token="<YOUR_HF_KEY>",
)

print(llm.invoke("What is Deep Learning?", stop=None, watermark=None, return_full_text=None, stop_sequences=None))

This should work. The reason for using the wrapper is that you can't simply pass the parameters as None to invoke. Doing so raises a TypeError, because stop_sequences gets concatenated with a list at some point.
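A standalone illustration of that failure mode (the values are hypothetical):

# This is the kind of concatenation that fails when stop_sequences is passed
# as None instead of a list:
params = {"stop_sequences": None}
runtime_stop = None
params["stop_sequences"] = params["stop_sequences"] + (runtime_stop or [])
# TypeError: unsupported operand type(s) for +: 'NoneType' and 'list'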

Warra07 avatar Oct 01 '24 11:10 Warra07

> Solution: Using HuggingFaceHub with LangChain [anujsahani01's full solution, quoted above]

This works for me.

chechoreyes avatar Nov 18 '24 00:11 chechoreyes

Hi, @michaelCHU95. I'm Dosu, and I'm helping the LangChain team manage their backlog. I'm marking this issue as stale.

Issue Summary:

  • You reported a bug where certain model_kwargs are not utilized, causing a "Bad request" error.
  • Users identified a recent commit as a potential cause, with default parameters being incorrectly injected.
  • Temporary resolutions include downgrading LangChain or using a custom wrapper.
  • An alternative method using HuggingFaceHub has been shared and worked for some users.

Next Steps:

  • Please confirm if this issue is still relevant with the latest version of LangChain. If so, you can keep the discussion open by commenting here.
  • If there is no further activity, this issue will be automatically closed in 7 days.

Thank you for your understanding and contribution!

dosubot[bot] avatar Feb 17 '25 01:02 dosubot[bot]

This issue is still there.

khameelbm avatar Apr 03 '25 05:04 khameelbm