HuggingFacePipeline: `trust_remote_code=True` allows download but not inference
System Info
Latest langchain version.
Who can help?
No response
Information
- [X] The official example notebooks/scripts
- [X] My own modified scripts
Related Components
- [X] LLMs/Chat Models
- [ ] Embedding Models
- [ ] Prompts / Prompt Templates / Prompt Selectors
- [ ] Output Parsers
- [ ] Document Loaders
- [ ] Vector Stores / Retrievers
- [ ] Memory
- [ ] Agents / Agent Executors
- [ ] Tools / Toolkits
- [ ] Chains
- [ ] Callbacks/Tracing
- [ ] Async
Reproduction
```
ValueError: Loading mosaicml/mpt-7b-chat requires you to execute the configuration file in that repo on your local machine. Make sure you have read the code there to avoid malicious use, then set the option `trust_remote_code=True` to remove this error.
```
Adding `"trust_remote_code": True` to `model_kwargs` lets the model download, but inference then fails with `TypeError: transformers.pipelines.base.infer_framework_load_model() got multiple values for keyword argument 'trust_remote_code'` after the download completes.
lang_chain8.py:

```python
from langchain import HuggingFacePipeline

llm = HuggingFacePipeline.from_model_id(
    model_id="mosaicml/mpt-7b-chat",
    task="text-generation",
    model_kwargs={"temperature": 0.1, "trust_remote_code": True},
)

from langchain import PromptTemplate, LLMChain

template = """Question: {question}
Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])
llm_chain = LLMChain(prompt=prompt, llm=llm)

question = "What is electroencephalography?"
print(llm_chain.run(question))
```
```
$ python lang_chain8.py
Loading checkpoint shards: 100%|███████████████████████████████████████████| 2/2 [01:07<00:00, 33.72s/it]
Traceback (most recent call last):
  File "/Users/russellballestrini/git/flaskchat/lang_chain8.py", line 4, in <module>
    llm = HuggingFacePipeline.from_model_id(model_id="mosaicml/mpt-7b-chat", task="text-generation", model_kwargs={"temperature":0.1, "trust_remote_code": True})
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/russellballestrini/git/flaskchat/env/lib/python3.11/site-packages/langchain/llms/huggingface_pipeline.py", line 118, in from_model_id
    pipeline = hf_pipeline(
               ^^^^^^^^^^^^
  File "/Users/russellballestrini/git/flaskchat/env/lib/python3.11/site-packages/transformers/pipelines/__init__.py", line 779, in pipeline
    framework, model = infer_framework_load_model(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: transformers.pipelines.base.infer_framework_load_model() got multiple values for keyword argument 'trust_remote_code'
```
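The collision itself can be reproduced without transformers installed: langchain forwards `model_kwargs` verbatim into `transformers.pipeline()`, which already passes `trust_remote_code` explicitly when loading the model, so the same keyword arrives twice. A minimal sketch of the mechanism (the function below is a simplified stand-in, not the real transformers signature):

```python
# Simplified stand-in for transformers.pipelines.base.infer_framework_load_model;
# the real function also accepts an explicit trust_remote_code keyword.
def infer_framework_load_model(model, trust_remote_code=False, **model_kwargs):
    return model

# What the user puts in model_kwargs ...
model_kwargs = {"temperature": 0.1, "trust_remote_code": True}

# ... and what the pipeline machinery then does: it passes trust_remote_code
# explicitly AND spreads model_kwargs, so the keyword is supplied twice.
try:
    infer_framework_load_model("mosaicml/mpt-7b-chat",
                               trust_remote_code=True, **model_kwargs)
except TypeError as e:
    print(e)  # ... got multiple values for keyword argument 'trust_remote_code'
```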
Expected behavior
I expect to be able to run inference on any Hugging Face model, even the untrusted ones.
@vowelparrot thanks for the patch, I tested it locally and hit a new error.
```
The model 'MPTForCausalLM' is not supported for text-generation. Supported models are ['BartForCausalLM', 'BertLMHeadModel', 'BertGenerationDecoder', 'BigBirdForCausalLM', 'BigBirdPegasusForCausalLM', 'BioGptForCausalLM', 'BlenderbotForCausalLM', 'BlenderbotSmallForCausalLM', 'BloomForCausalLM', 'CamembertForCausalLM', 'CodeGenForCausalLM', 'CpmAntForCausalLM', 'CTRLLMHeadModel', 'Data2VecTextForCausalLM', 'ElectraForCausalLM', 'ErnieForCausalLM', 'GitForCausalLM', 'GPT2LMHeadModel', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTNeoForCausalLM', 'GPTNeoXForCausalLM', 'GPTNeoXJapaneseForCausalLM', 'GPTJForCausalLM', 'LlamaForCausalLM', 'MarianForCausalLM', 'MBartForCausalLM', 'MegaForCausalLM', 'MegatronBertForCausalLM', 'MvpForCausalLM', 'OpenAIGPTLMHeadModel', 'OPTForCausalLM', 'PegasusForCausalLM', 'PLBartForCausalLM', 'ProphetNetForCausalLM', 'QDQBertLMHeadModel', 'ReformerModelWithLMHead', 'RemBertForCausalLM', 'RobertaForCausalLM', 'RobertaPreLayerNormForCausalLM', 'RoCBertForCausalLM', 'RoFormerForCausalLM', 'Speech2Text2ForCausalLM', 'TransfoXLLMHeadModel', 'TrOCRForCausalLM', 'XGLMForCausalLM', 'XLMWithLMHeadModel', 'XLMProphetNetForCausalLM', 'XLMRobertaForCausalLM', 'XLMRobertaXLForCausalLM', 'XLNetLMHeadModel', 'XmodForCausalLM'].
```
@russellballestrini have you tried with `text2text-generation` instead of `text-generation`?
I think the HF pipeline doesn't yet support the Mosaic config right now for either task.
I know this works:

```python
model = AutoModelForCausalLM.from_pretrained("mosaicml/mpt-7b-instruct", trust_remote_code=True)
```

but I want to deploy it on an Endpoint. Is there any easy way to deploy using the above `AutoModelForCausalLM` approach?
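One possible way around both errors (a sketch, not verified against mpt-7b-chat, and whether `transformers.pipeline` accepts the custom `MPTForCausalLM` class may still depend on the transformers version) is to build the transformers pipeline yourself, so `trust_remote_code` is passed exactly once, and then hand the ready-made pipeline to `HuggingFacePipeline` via its `pipeline` field instead of `from_model_id`:

```python
def build_mpt_llm(model_id="mosaicml/mpt-7b-chat"):
    # Imports are inside the function so the sketch can be defined without
    # transformers/langchain available; nothing is downloaded until called.
    from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
    from langchain import HuggingFacePipeline

    # Pass trust_remote_code exactly once, at model load time.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    # Give the already-loaded objects to transformers.pipeline, so langchain
    # never forwards trust_remote_code a second time.
    pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
    return HuggingFacePipeline(pipeline=pipe)
```

The returned object drops into `LLMChain(prompt=prompt, llm=build_mpt_llm())` exactly like the `from_model_id` version.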
Can this be looked into, please?
Hi, @russellballestrini! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
From what I understand, the issue is related to setting `trust_remote_code=True`, which allows for model download but throws a `TypeError` during inference. There have been some suggestions and attempts to resolve the issue, such as a patch suggested by @vowelparrot, but it resulted in a new error. @ptah23 suggested trying `text2text-generation` instead of `text-generation`. Additionally, @Divjyot found a workaround for deployment using `AutoModelForCausalLM`, but is looking for an easier solution. @ksachdeva11 requested further investigation.
Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.
Thank you for your understanding and contribution to the LangChain project!