HuggingFacePipeline: `trust_remote_code=True` allows download but not inference
System Info
Latest langchain version.
Who can help?
No response
Information
- [X] The official example notebooks/scripts
- [X] My own modified scripts
Related Components
- [X] LLMs/Chat Models
- [ ] Embedding Models
- [ ] Prompts / Prompt Templates / Prompt Selectors
- [ ] Output Parsers
- [ ] Document Loaders
- [ ] Vector Stores / Retrievers
- [ ] Memory
- [ ] Agents / Agent Executors
- [ ] Tools / Toolkits
- [ ] Chains
- [ ] Callbacks/Tracing
- [ ] Async
Reproduction
```
ValueError: Loading mosaicml/mpt-7b-chat requires you to execute the configuration file in that repo on your local machine. Make sure you have read the code there to avoid malicious use, then set the option `trust_remote_code=True` to remove this error.
```
Adding `"trust_remote_code": True` to `model_kwargs` lets the model download, but inference then fails with `TypeError: transformers.pipelines.base.infer_framework_load_model() got multiple values for keyword argument 'trust_remote_code'` after the download completes.
lang_chain8.py:

```python
from langchain import HuggingFacePipeline

llm = HuggingFacePipeline.from_model_id(
    model_id="mosaicml/mpt-7b-chat",
    task="text-generation",
    model_kwargs={"temperature": 0.1, "trust_remote_code": True},
)

from langchain import PromptTemplate, LLMChain

template = """Question: {question}
Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])
llm_chain = LLMChain(prompt=prompt, llm=llm)

question = "What is electroencephalography?"
print(llm_chain.run(question))
```
```
$ python lang_chain8.py
Loading checkpoint shards: 100%|███████████████████████████████████████████| 2/2 [01:07<00:00, 33.72s/it]
Traceback (most recent call last):
  File "/Users/russellballestrini/git/flaskchat/lang_chain8.py", line 4, in <module>
    llm = HuggingFacePipeline.from_model_id(model_id="mosaicml/mpt-7b-chat", task="text-generation", model_kwargs={"temperature":0.1, "trust_remote_code": True})
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/russellballestrini/git/flaskchat/env/lib/python3.11/site-packages/langchain/llms/huggingface_pipeline.py", line 118, in from_model_id
    pipeline = hf_pipeline(
               ^^^^^^^^^^^^
  File "/Users/russellballestrini/git/flaskchat/env/lib/python3.11/site-packages/transformers/pipelines/__init__.py", line 779, in pipeline
    framework, model = infer_framework_load_model(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: transformers.pipelines.base.infer_framework_load_model() got multiple values for keyword argument 'trust_remote_code'
```
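The collision itself can be reproduced without transformers installed: langchain forwards `model_kwargs` verbatim into `transformers.pipeline()`, which already passes `trust_remote_code` explicitly when loading the model, so the same keyword arrives twice. A minimal sketch of the mechanism (the function below is a simplified stand-in, not the real transformers signature):

```python
# Simplified stand-in for transformers.pipelines.base.infer_framework_load_model;
# the real function also accepts an explicit trust_remote_code keyword.
def infer_framework_load_model(model, trust_remote_code=False, **model_kwargs):
    return model

# What the user puts in model_kwargs ...
model_kwargs = {"temperature": 0.1, "trust_remote_code": True}

# ... and what the pipeline machinery then does: it passes trust_remote_code
# explicitly AND spreads model_kwargs, so the keyword is supplied twice.
try:
    infer_framework_load_model("mosaicml/mpt-7b-chat",
                               trust_remote_code=True, **model_kwargs)
except TypeError as e:
    print(e)  # ... got multiple values for keyword argument 'trust_remote_code'
```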
Expected behavior
I expect to be able to run inference on any Hugging Face model, even the untrusted ones.
@vowelparrot thanks for the patch, I tested it locally and hit a new error.
```
The model 'MPTForCausalLM' is not supported for text-generation. Supported models are ['BartForCausalLM', 'BertLMHeadModel', 'BertGenerationDecoder', 'BigBirdForCausalLM', 'BigBirdPegasusForCausalLM', 'BioGptForCausalLM', 'BlenderbotForCausalLM', 'BlenderbotSmallForCausalLM', 'BloomForCausalLM', 'CamembertForCausalLM', 'CodeGenForCausalLM', 'CpmAntForCausalLM', 'CTRLLMHeadModel', 'Data2VecTextForCausalLM', 'ElectraForCausalLM', 'ErnieForCausalLM', 'GitForCausalLM', 'GPT2LMHeadModel', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTNeoForCausalLM', 'GPTNeoXForCausalLM', 'GPTNeoXJapaneseForCausalLM', 'GPTJForCausalLM', 'LlamaForCausalLM', 'MarianForCausalLM', 'MBartForCausalLM', 'MegaForCausalLM', 'MegatronBertForCausalLM', 'MvpForCausalLM', 'OpenAIGPTLMHeadModel', 'OPTForCausalLM', 'PegasusForCausalLM', 'PLBartForCausalLM', 'ProphetNetForCausalLM', 'QDQBertLMHeadModel', 'ReformerModelWithLMHead', 'RemBertForCausalLM', 'RobertaForCausalLM', 'RobertaPreLayerNormForCausalLM', 'RoCBertForCausalLM', 'RoFormerForCausalLM', 'Speech2Text2ForCausalLM', 'TransfoXLLMHeadModel', 'TrOCRForCausalLM', 'XGLMForCausalLM', 'XLMWithLMHeadModel', 'XLMProphetNetForCausalLM', 'XLMRobertaForCausalLM', 'XLMRobertaXLForCausalLM', 'XLNetLMHeadModel', 'XmodForCausalLM'].
```
@russellballestrini have you tried with `text2text-generation` instead of `text-generation`?
I think the HF pipeline doesn't yet support the Mosaic config right now for either task.
I know this works:

```python
model = AutoModelForCausalLM.from_pretrained("mosaicml/mpt-7b-instruct", trust_remote_code=True)
```

but I want to deploy it on an Endpoint. Is there any easy way to deploy using the above `AutoModelForCausalLM` approach?
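One possible way around both errors (a sketch, not verified against mpt-7b-chat, and whether `transformers.pipeline` accepts the custom `MPTForCausalLM` class may still depend on the transformers version) is to build the transformers pipeline yourself, so `trust_remote_code` is passed exactly once, and then hand the ready-made pipeline to `HuggingFacePipeline` via its `pipeline` field instead of `from_model_id`:

```python
def build_mpt_llm(model_id="mosaicml/mpt-7b-chat"):
    # Imports are inside the function so the sketch can be defined without
    # transformers/langchain available; nothing is downloaded until called.
    from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
    from langchain import HuggingFacePipeline

    # Pass trust_remote_code exactly once, at model load time.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    # Give the already-loaded objects to transformers.pipeline, so langchain
    # never forwards trust_remote_code a second time.
    pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
    return HuggingFacePipeline(pipeline=pipe)
```

The returned object drops into `LLMChain(prompt=prompt, llm=build_mpt_llm())` exactly like the `from_model_id` version.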
Can this be looked into, please?
Hi, @russellballestrini! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
From what I understand, the issue is related to setting `trust_remote_code=True`, which allows for model download but throws a `TypeError` during inference. There have been some suggestions and attempts to resolve the issue, such as a patch suggested by @vowelparrot, but it resulted in a new error. @ptah23 suggested trying `text2text-generation` instead of `text-generation`. Additionally, @Divjyot found a workaround for deployment using `AutoModelForCausalLM`, but is looking for an easier solution. @ksachdeva11 requested further investigation.
Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.
Thank you for your understanding and contribution to the LangChain project!