HuggingFaceEndpoint returning buggy responses and prompt template back

Open • myke11j opened this issue 2 months ago • 2 comments

Checked other resources

  • [X] I added a very descriptive title to this issue.
  • [X] I searched the LangChain documentation with the integrated search.
  • [X] I used the GitHub search to find a similar question and didn't find it.
  • [X] I am sure that this is a bug in LangChain rather than my code.
  • [X] The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

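from operator import itemgetter

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_huggingface import HuggingFaceEndpoint

# api_token, BASE_TEMPLATE, retrieve_context, user_query, and vector are defined elsewhere.
llm = HuggingFaceEndpoint(
    repo_id='meta-llama/Llama-3.1-8B-Instruct',
    huggingfacehub_api_token=api_token
)
llm_chain = (
    {
        # Build the context from the question and the supplied vector input.
        "context": lambda inputs: retrieve_context(inputs['question'], inputs['vector']),
        # Pass only the question string; RunnablePassthrough() would forward the whole input dict.
        "question": itemgetter("question")
    }
    | PromptTemplate(
        template=BASE_TEMPLATE,
        input_variables=["context", "question"]
    )
    | llm
    | StrOutputParser()
)

llm_chain.invoke({'question': user_query, 'vector': vector})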

Error Message and Stack Trace (if applicable)

As you can see in LangSmith, it returned this output.

[Screenshot: LangSmith trace showing the returned output]

Description

I'm using HuggingFaceEndpoint for inference to avoid storing the model on my local machine, and I've noticed that it frequently gives buggy responses. I'm using it for RAG, and it often just returns the entire base prompt template inside [INST]...[/INST]. As seen in the attached screenshot, it also returned "[/INST]" in a loop until the max tokens limit was reached.
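To make the two symptoms above easier to count across runs, here is a minimal diagnostic sketch in plain Python. It is not part of LangChain or huggingface_hub: the helper names (`echoes_prompt`, `tag_loop_length`, `diagnose`) and the thresholds (`min_chars`, `max_tag_run`) are hypothetical and chosen only for illustration. It assumes you have both the rendered prompt string and the raw string returned by the chain.

```python
import re

# Hypothetical helpers for illustration; not a LangChain or huggingface_hub API.
INST_TAG = re.compile(r"\[/?INST\]")


def echoes_prompt(prompt: str, output: str, min_chars: int = 40) -> bool:
    """Return True if the output contains the first min_chars of the prompt verbatim."""
    head = prompt.strip()[:min_chars]
    return bool(head) and head in output


def tag_loop_length(output: str) -> int:
    """Return the longest run of consecutive [INST] or [/INST] tags in the output."""
    longest = 0
    # A run is one or more tags separated only by whitespace.
    for run in re.finditer(r"(?:\[/?INST\]\s*)+", output):
        longest = max(longest, len(INST_TAG.findall(run.group())))
    return longest


def diagnose(prompt: str, output: str, max_tag_run: int = 3) -> dict:
    """Flag the two symptoms (prompt echo, tag loop) for a single generation."""
    return {
        "echoes_prompt": echoes_prompt(prompt, output),
        "tag_loop": tag_loop_length(output) >= max_tag_run,
    }
```

One possible use is to call `diagnose(rendered_prompt, response)` on each chain result and log the flags next to the LangSmith trace, so that the failure rate can be reported as a number rather than "quite a few times".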

System Info

System Information

OS: Darwin
OS Version: Darwin Kernel Version 22.4.0: Mon Mar 6 21:00:17 PST 2023; root:xnu-8796.101.5~3/RELEASE_X86_64
Python Version: 3.12.6 (main, Sep 6 2024, 19:03:47) [Clang 15.0.0 (clang-1500.1.0.2.5)]

Package Information

langchain_core: 0.3.19
langchain: 0.3.7
langchain_community: 0.3.7
langsmith: 0.1.143
langchain_huggingface: 0.1.2
langchain_text_splitters: 0.3.2
langchainhub: 0.1.21

Optional packages not installed

langgraph
langserve

Other Dependencies

aiohttp: 3.11.4
async-timeout: Installed. No version info available.
dataclasses-json: 0.6.7
httpx: 0.27.2
httpx-sse: 0.4.0
huggingface-hub: 0.26.2
jsonpatch: 1.33
numpy: 1.26.4
orjson: 3.10.11
packaging: 24.2
pydantic: 2.9.2
pydantic-settings: 2.6.1
PyYAML: 6.0.2
requests: 2.32.3
requests-toolbelt: 1.0.0
sentence-transformers: 3.3.1
SQLAlchemy: 2.0.35
tenacity: 9.0.0
tokenizers: 0.20.3
transformers: 4.46.3
types-requests: 2.32.0.20241016
typing-extensions: 4.12.2

myke11j, Dec 06 '24 13:12