
Verify new tools api code with GoogleAI

brainlid opened this issue 9 months ago

The PR #105 was merged to main, and the tests pass, but I didn't run it against a GoogleAI LLM. I would appreciate help validating that.

@jadengis, this impacts your code the most. Just a heads-up.

brainlid avatar Apr 27 '24 02:04 brainlid

@brainlid Sure, I should be able to run main against my setup and see if it breaks anything. Are there any breaking changes to the API I'd need to integrate?

jadengis avatar Apr 27 '24 06:04 jadengis

The big change is around function calls and function results.

An assistant message can contain 0 to many ToolCalls. A Tool message contains 1 or more ToolResults.
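As a rough sketch of that shape, using the struct and field names visible in the test output later in this thread (the `ToolResult` fields and the `call_id` value here are assumptions for illustration):

```elixir
# An assistant message may carry zero or more ToolCalls.
assistant = %LangChain.Message{
  role: :assistant,
  status: :complete,
  content: nil,
  tool_calls: [
    %LangChain.Message.ToolCall{
      status: :complete,
      type: :function,
      call_id: "call_abc123",  # hypothetical ID
      name: "calculator",
      arguments: %{"expression" => "100 + 300 - 200"}
    }
  ]
}

# A :tool message carries one or more ToolResults
# answering those calls.
tool_message = %LangChain.Message{
  role: :tool,
  tool_results: [
    %LangChain.Message.ToolResult{
      tool_call_id: "call_abc123",  # hypothetical; matches the call above
      content: "200"
    }
  ]
}
```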


brainlid avatar Apr 27 '24 23:04 brainlid

Looking forward to this as Gemini Pro APIs have become publicly available. This will also allow me to use this project as an abstraction for LLMs in my Elixir-port of Autogen which is a multi-agent framework from Microsoft.

Here are the Function Calling docs for Gemini models: https://ai.google.dev/gemini-api/docs/function-calling

nileshtrivedi avatar Jun 06 '24 03:06 nileshtrivedi

@nileshtrivedi if you are interested, I would appreciate some help testing out the above change. I tried it out in my project but ran into some errors with the new code and couldn't get it working. I haven't had time to go back and fix the issues, however.

jadengis avatar Jun 06 '24 07:06 jadengis

@jadengis I thought of modifying the existing tools/calculator_test for Google Gemini models but even the existing test for OpenAI seems to fail for me at this line because message.content seems to be nil. This is the message object when this assertion fails:

%LangChain.Message{content: nil, processed_content: nil, index: 0, status: :complete, role: :assistant, name: nil, tool_calls: [%LangChain.Message.ToolCall{status: :complete, type: :function, call_id: "call_9Fq4ZN3U4Ln4D52Kwg1DXj0G", name: "calculator", arguments: %{"expression" => "100 + 300 - 200"}, index: nil}], tool_results: nil}

I don't know whether it's a bug in the test code itself or something else. Unable to test further. 🫤

For my own work, even Autogen itself currently seems broken for Gemini models. I ended up falling back to Python example code using Google's own SDK.

nileshtrivedi avatar Jun 06 '24 07:06 nileshtrivedi

@nileshtrivedi I updated and fixed the Calculator tool and tests. Thanks for pointing that out!

https://github.com/brainlid/langchain/pull/132

brainlid avatar Jun 06 '24 12:06 brainlid

@jadengis please let me know what errors you're getting! We can hopefully get it ironed out quickly.

brainlid avatar Jun 06 '24 12:06 brainlid

For migrating, I tried to document what would be needed in the CHANGELOG. Let me know if you're finding gaps!

brainlid avatar Jun 06 '24 13:06 brainlid

I submitted #135 as a failing test. While mix test test/chat_models/chat_google_ai_test.exs --include live_call passes, mix test test/tools/calculator_gemini_test.exs --include live_call fails with "Unexpected response" from the LLM.

nileshtrivedi avatar Jun 11 '24 12:06 nileshtrivedi

I think there are multiple errors in calling Gemini model APIs properly:

  • model name: "gemini-pro" is no longer valid. Valid model names (e.g. "gemini-1.5-pro" / "gemini-1.5-flash") are listed here.
  • Version is included twice in the URL. The request is being made to: https://generativelanguage.googleapis.com/v1beta/v1beta/models/gemini-1.5-flash:generateContent . Notice v1beta appearing twice in the URL. This is because the private method build_url appends version to endpoint which already includes the version.
  • The Utils.fire_callback method is not defined.

There may be more issues.
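The second issue above can be sketched like this (a simplification, assuming `build_url` naively joins endpoint and version; the actual private function may differ):

```elixir
# Buggy: the endpoint already includes the API version,
# and build_url appends the version again.
endpoint = "https://generativelanguage.googleapis.com/v1beta"
api_version = "v1beta"
buggy_url = "#{endpoint}/#{api_version}/models/gemini-1.5-flash:generateContent"
# => ".../v1beta/v1beta/models/gemini-1.5-flash:generateContent"

# Fixed: keep the version out of the base endpoint so it is added exactly once.
base = "https://generativelanguage.googleapis.com"
fixed_url = "#{base}/#{api_version}/models/gemini-1.5-flash:generateContent"
# => ".../v1beta/models/gemini-1.5-flash:generateContent"
```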

EDITED: Noticed this open PR for fixing the endpoint: https://github.com/brainlid/langchain/pull/118 Seems like there are subtle differences between the Gemini API and the VertexAI API that are causing these.

nileshtrivedi avatar Jun 14 '24 05:06 nileshtrivedi

@nileshtrivedi We split out ChatVertexAI from ChatGoogleAI because the differences were subtle but throughout. In the published RC, the callbacks have been updated as well. Thank you for looking at it before. How does it look now?

brainlid avatar Jun 19 '24 13:06 brainlid

The Gemini API still seems to fail when tested.

I tested after making this change in test/tools/calculator_test.exs:

--- a/test/tools/calculator_test.exs
+++ b/test/tools/calculator_test.exs
@@ -5,7 +5,7 @@ defmodule LangChain.Tools.CalculatorTest do
   doctest LangChain.Tools.Calculator
   alias LangChain.Tools.Calculator
   alias LangChain.Function
-  alias LangChain.ChatModels.ChatOpenAI
+  alias LangChain.ChatModels.ChatGoogleAI
   alias LangChain.Message.ToolCall
   alias LangChain.Message.ToolResult
 
@@ -80,7 +80,7 @@ defmodule LangChain.Tools.CalculatorTest do
         end
       }
 
-      model = ChatOpenAI.new!(%{seed: 0, temperature: 0, stream: false, callbacks: [llm_handler]})
+      model = ChatGoogleAI.new!(%{model: "gemini-1.5-pro", api_key: System.fetch_env!("GEMINI_API_KEY"), seed: 0, temperature: 0, stream: false, callbacks: [llm_handler]})

This is the test failure I get:

% mix test test/tools/calculator_test.exs --include live_call
Compiling 1 file (.ex)
Including tags: [:live_call]

.

  1) test live test performs repeated calls until complete with a live LLM (LangChain.Tools.CalculatorTest)
     test/tools/calculator_test.exs:68
     ** (MatchError) no match of right hand side value: {:error, %LangChain.Chains.LLMChain{llm: %LangChain.ChatModels.ChatGoogleAI{endpoint: "https://generativelanguage.googleapis.com/v1beta", api_version: "v1beta", model: "gemini-1.5-pro", api_key: "REMOVED_FOR_SAFETY", temperature: 0.0, top_p: 1.0, top_k: 1.0, receive_timeout: 60000, stream: false, callbacks: [%{on_llm_new_message: #Function<1.19352243/2 in LangChain.Tools.CalculatorTest."test live test performs repeated calls until complete with a live LLM"/1>}]}, verbose: false, verbose_deltas: false, tools: [%LangChain.Function{name: "calculator", description: "Perform basic math calculations or expressions", display_text: nil, function: #Function<0.76765322/2 in LangChain.Tools.Calculator.execute>, async: true, parameters_schema: %{type: "object", required: ["expression"], properties: %{expression: %{type: "string", description: "A simple mathematical expression"}}}, parameters: []}], _tool_map: %{"calculator" => %LangChain.Function{name: "calculator", description: "Perform basic math calculations or expressions", display_text: nil, function: #Function<0.76765322/2 in LangChain.Tools.Calculator.execute>, async: true, parameters_schema: %{type: "object", required: ["expression"], properties: %{expression: %{type: "string", description: "A simple mathematical expression"}}}, parameters: []}}, messages: [%LangChain.Message{content: "Answer the following math question: What is 100 + 300 - 200?", processed_content: nil, index: nil, status: :complete, role: :user, name: nil, tool_calls: [], tool_results: nil}], custom_context: nil, message_processors: [], max_retry_count: 3, current_failure_count: 0, delta: nil, last_message: %LangChain.Message{content: "Answer the following math question: What is 100 + 300 - 200?", processed_content: nil, index: nil, status: :complete, role: :user, name: nil, tool_calls: [], tool_results: nil}, needs_response: true, callbacks: [%{on_tool_response_created: #Function<2.19352243/2 
in LangChain.Tools.CalculatorTest."test live test performs repeated calls until complete with a live LLM"/1>}]}, "Unexpected response"}
     code: {:ok, updated_chain, %Message{} = message} =
     stacktrace:
       test/tools/calculator_test.exs:85: (test)

     The following output was logged:
     
     14:01:18.910 [error] Trying to process an unexpected response. ""
     
     14:01:18.910 [error] Error during chat call. Reason: "Unexpected response"
     
......
Finished in 1.8 seconds (0.00s async, 1.8s sync)
8 tests, 1 failure

Randomized with seed 98474

I confirmed that my actual api_key was printed where it says REMOVED_FOR_SAFETY. I also tried model: "gemini-1.5-flash" but got the same error.

I think it might be easier if you sign up on https://ai.google.dev/ to get an API key to help with testing?

nileshtrivedi avatar Jun 22 '24 08:06 nileshtrivedi

Hello @nileshtrivedi,

If you change these lines: https://github.com/brainlid/langchain/blob/d63e11a3e926450bcb2e983ae64c135ccb52c822/lib/chat_models/chat_google_ai.ex#L27-L29

to

  @default_endpoint "https://generativelanguage.googleapis.com"
  @default_api_version "v1beta"

Do the tests pass?

ljgago avatar Jun 28 '24 00:06 ljgago

@ljgago No, it fails but with a different error:

% mix test test/tools/calculator_test.exs --include live_call              
Compiling 39 files (.ex)
Generated langchain app
Including tags: [:live_call]

....

  1) test live test performs repeated calls until complete with a live LLM (LangChain.Tools.CalculatorTest)
     test/tools/calculator_test.exs:68
     ** (LangChain.LangChainError) content: is invalid for role tool
     code: |> LLMChain.run(mode: :while_needs_response)
     stacktrace:
       (langchain 0.3.0-rc.0) lib/message.ex:408: LangChain.Message.new_tool_result!/1
       (langchain 0.3.0-rc.0) lib/chains/llm_chain.ex:692: LangChain.Chains.LLMChain.execute_tool_calls/2
       (langchain 0.3.0-rc.0) lib/chains/llm_chain.ex:321: LangChain.Chains.LLMChain.run_while_needs_response/1
       test/tools/calculator_test.exs:95: (test)

     The following output was logged:
     
     06:58:59.287 [debug] Executing function "calculator"
     
...
Finished in 2.6 seconds (0.00s async, 2.6s sync)
8 tests, 1 failure

Randomized with seed 604620

I am happy to get on a call with any devs to work this out.

nileshtrivedi avatar Jun 28 '24 01:06 nileshtrivedi

@nileshtrivedi I set up a GoogleAI account and got an API key. There are multiple issues with the ChatGoogleAI implementation.

I've fixed a couple locally (not pushed yet), but there's an issue with the ToolResult handling that I'm still trying to figure out.

A big issue is that Google's API docs are really poor. Their API also does things I haven't seen any other API do (i.e. odd behaviors). All in all, I don't like Google's service! :grimacing:

Still, your issue is valid, and I hope to have a resolution sometime soon. Thanks!

brainlid avatar Jul 05 '24 20:07 brainlid

@nileshtrivedi This is hopefully fixed now! :crossed_fingers:

Just merged PR #152 to main. If you test using main it should be working now. Please let me know!

brainlid avatar Jul 06 '24 01:07 brainlid

Thanks @brainlid ! This definitely seems to have improved as now ChatGoogleAI responses are being collected. To the user's message Answer the following math question: What is 100 + 300 - 200?, ChatGoogleAI returns The answer is 200..

However, there seem to be differences between ChatOpenAI and ChatGoogleAI return values. Specifically, I get a test failure on this line:

1) test live test performs repeated calls until complete with a live LLM (LangChain.Tools.CalculatorTest)
     test/tools/calculator_test.exs:68
     ** (FunctionClauseError) no function clause matching in Kernel.=~/2

     The following arguments were given to Kernel.=~/2:

         # 1
         [%LangChain.Message.ContentPart{type: :text, content: "The answer is 200.", options: nil}]

         # 2
         "100 + 300 - 200"

     Attempted function clauses (showing 3 out of 3):

         def =~(left, "") when is_binary(left)
         def =~(left, right) when is_binary(left) and is_binary(right)
         def =~(left, right) when is_binary(left)

     code: assert message.content =~ "100 + 300 - 200"

With ChatGoogleAI, instead of message.content being a string, it is actually [%LangChain.Message.ContentPart{type: :text, content: "The answer is 200.", options: nil}].

Also, Gemini's response is simply The answer is 200. so this line would fail anyway. Additionally, message.tool_calls seems to be empty.

nileshtrivedi avatar Jul 06 '24 11:07 nileshtrivedi

I think it may require setting the function calling mode to ANY, as per the Gemini docs.
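For reference, the Gemini docs describe this as a `tool_config` field on the request body. A hedged sketch of what that fragment might look like if the library exposed it (the surrounding request shape here is an assumption, not the library's actual API):

```elixir
# Request-body fragment per Gemini's function-calling docs:
# mode "ANY" forces the model to reply with a function call
# instead of plain text ("AUTO" lets the model choose).
tool_config = %{
  "tool_config" => %{
    "function_calling_config" => %{"mode" => "ANY"}
  }
}
```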

nileshtrivedi avatar Jul 06 '24 11:07 nileshtrivedi

My bad, I didn't notice the new tests added (including the one for tool calling). mix test test/chat_models/chat_google_ai_test.exs --include live_google_ai is passing all the tests.

Kudos @brainlid for following up and fixing this! 👏

nileshtrivedi avatar Jul 06 '24 11:07 nileshtrivedi

@nileshtrivedi I'm pretty annoyed by GoogleAI, honestly. It's such an oddball compared to others. The API docs are sparse and difficult to use too. Ugh.

One odd thing you noticed is that the assistant returns the content in parts every time. That's just a decision they made. We could pattern match on that and, if there is only a single text part, flatten it to be content: "the text".

I haven't used it enough to see it return anything else though. Have you? If it makes sense, then doing that would make it easier to swap out the backend AI without impacting an application.

I updated LangChain.Utils.ChainResult.to_string to match on that and make it easier to get the answer out.
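The flattening idea could be sketched like this (a minimal illustration using the `ContentPart` shape from the test failure above; the module and function names are hypothetical, not library code):

```elixir
defmodule FlattenExample do
  alias LangChain.Message
  alias LangChain.Message.ContentPart

  # If the message content is exactly one text ContentPart,
  # collapse it to a plain string so it matches ChatOpenAI's shape.
  def maybe_flatten(%Message{content: [%ContentPart{type: :text, content: text}]} = msg),
    do: %{msg | content: text}

  # Anything else (multiple parts, nil, already a string) passes through.
  def maybe_flatten(%Message{} = msg), do: msg
end
```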

brainlid avatar Jul 06 '24 14:07 brainlid