
Unable to use ChatVertexAI as provider due to default chat's ChatPromptTemplate

Open michaelchia opened this issue 2 years ago • 5 comments

Problem

  • I am unable to use ChatVertexAI as a provider for chat because the final message in the default chat handler's ChatPromptTemplate is an AIMessage (see the sketch after this list). https://github.com/jupyterlab/jupyter-ai/blob/0f619061430be6320bf9b87c63a1b939597998f9/packages/jupyter-ai/jupyter_ai/chat_handlers/default.py#L35-L52
  • LangChain's ChatVertexAI requires that the last message be a HumanMessage, not an AIMessage or any other type. https://github.com/hwchase17/langchain/blob/8f5eca236fa399bab81ee7a533cc37efd27a257d/langchain/chat_models/vertexai.py#L122-L126
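
For illustration, a minimal sketch of the conflict (not the exact jupyter-ai code; the system prompt text here is made up):

```python
# Minimal sketch: a chat prompt template that ends in an empty AIMessage,
# which is roughly what the default chat handler builds.
from langchain.prompts import (
    AIMessagePromptTemplate,
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder,
    SystemMessagePromptTemplate,
)

prompt_template = ChatPromptTemplate.from_messages(
    [
        SystemMessagePromptTemplate.from_template("You are a helpful assistant."),
        MessagesPlaceholder(variable_name="history"),
        HumanMessagePromptTemplate.from_template("{input}"),
        AIMessagePromptTemplate.from_template(""),  # trailing empty AI message
    ]
)

messages = prompt_template.format_messages(history=[], input="Hello!")
print(type(messages[-1]).__name__)
# -> AIMessage; ChatVertexAI raises a ValueError here because it requires
#    the last message to be a HumanMessage.
```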

Proposed Solution

  • I am not sure what the purpose of the empty AIMessage is. Perhaps review whether it is absolutely necessary.

michaelchia avatar Jul 13 '23 04:07 michaelchia

@michaelchia Hey Michael, I had sent a reply but it appears to have been lost by GitHub 😭. Sorry for the late response.

The reason we add the empty AI message is to indicate to the LLM that it should generate a response to the prompt instead of generating a continuation of the prompt. From our extensive testing with the providers we offer, we determined this to be necessary for certain providers like AI21 and Cohere. The empty AI message is also part of LangChain's default prompt template for conversation chains: https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/chains/conversation/prompt.py
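
For comparison, here is a rough paraphrase of that default conversation prompt (illustrative, not copied verbatim from prompt.py); the trailing "AI:" turn plays the same role as the empty AIMessage:

```python
# Rough paraphrase of LangChain's default conversation prompt (illustrative only).
from langchain.prompts import PromptTemplate

template = """The following is a friendly conversation between a human and an AI.

Current conversation:
{history}
Human: {input}
AI:"""

prompt = PromptTemplate(input_variables=["history", "input"], template=template)
print(prompt.format(history="", input="Hello!"))
# The empty "AI:" turn at the end cues the model to answer rather than
# continue the human's message, which is the same purpose the empty AIMessage serves.
```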

Hence, I'm very surprised that the ChatVertexAI provider explicitly raises an error when the last message is an AI message, as this goes against the self-consistency of LangChain itself. Though I haven't used it, a glance at the code suggests that ChatVertexAI would also fail with LangChain's default conversation chain. Since VertexAI seems to be the exception, I'm inclined to argue that the solution to this issue is to add a custom Pydantic attribute on the ChatVertexAI provider indicating that an empty AI message should not be appended. Our backend would then check that attribute before building the prompt template.
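
A hypothetical sketch of what I mean; the attribute name `append_empty_ai_message` and the classes below are illustrative only, not the actual jupyter-ai provider API (the real providers are Pydantic models, so the flag would be a Pydantic field):

```python
# Hypothetical sketch of the proposed fix: providers carry an opt-out flag
# that the prompt builder checks before appending the empty AI message.
from langchain.prompts import (
    AIMessagePromptTemplate,
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder,
    SystemMessagePromptTemplate,
)


class BaseProvider:
    # Most providers keep the empty AI message that nudges the model to respond.
    append_empty_ai_message: bool = True


class VertexAIProvider(BaseProvider):
    # ChatVertexAI rejects a trailing AIMessage, so this provider opts out.
    append_empty_ai_message: bool = False


def build_chat_prompt(provider: BaseProvider) -> ChatPromptTemplate:
    """Build the chat prompt, honoring the provider's opt-out flag."""
    messages = [
        SystemMessagePromptTemplate.from_template("You are a helpful assistant."),
        MessagesPlaceholder(variable_name="history"),
        HumanMessagePromptTemplate.from_template("{input}"),
    ]
    if provider.append_empty_ai_message:
        messages.append(AIMessagePromptTemplate.from_template(""))
    return ChatPromptTemplate.from_messages(messages)
```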

dlqqq avatar Jul 24 '23 16:07 dlqqq

Yep, makes sense. I would argue against even adding this extra attribute on your end, as it adds unnecessary complexity. It seems like a VertexAI issue that should be solved within their LangChain object. On my end, I have a workaround that isn't too hacky (overriding the _generate method to remove that extra AIMessage). Thanks for the consideration.
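
Roughly, the workaround looks like this (a sketch rather than my exact code; the run_manager parameter is typed loosely, and instantiating the class still requires the Vertex AI SDK):

```python
# Sketch of the workaround: subclass ChatVertexAI and drop a trailing empty
# AIMessage before delegating to the parent's _generate, so the last message
# is the HumanMessage it expects.
from typing import Any, List, Optional

from langchain.chat_models import ChatVertexAI
from langchain.schema import AIMessage, BaseMessage, ChatResult


class PatchedChatVertexAI(ChatVertexAI):
    def _generate(
        self,
        messages: List[BaseMessage],
        stop: Optional[List[str]] = None,
        run_manager: Optional[Any] = None,
        **kwargs: Any,
    ) -> ChatResult:
        # Strip the empty AIMessage appended by the default prompt template.
        if messages and isinstance(messages[-1], AIMessage) and not messages[-1].content:
            messages = messages[:-1]
        return super()._generate(messages, stop=stop, run_manager=run_manager, **kwargs)
```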

michaelchia avatar Jul 24 '23 16:07 michaelchia

Haha, part of the reason we subclass all of the LangChain providers we offer is precisely to work around upstream issues until they're patched. We are inclined to add VertexAI to Jupyter AI, so for other users, the additional attribute would be necessary.

dlqqq avatar Jul 24 '23 16:07 dlqqq

See also #226 for customizing prompts for models/providers.

JasonWeill avatar Jul 24 '23 23:07 JasonWeill

Hi @dlqqq - Will from the LangChain team here - love what you all are doing with Jupyter AI! We'd love to set up a Slack channel with your team to make sure we can prioritize fixes like this and that the modules you are using stably support the project. If you send an email to [email protected] we'll open that line of communication. Thank you!

I know Piyush has made a lot of contributions to the project as well :)

hinthornw avatar Aug 04 '23 14:08 hinthornw