Verify new tools api code with GoogleAI
PR #105 was merged to `main`, and the tests pass, but I didn't run it against a GoogleAI LLM. I would appreciate help validating that.
@jadengis, this impacts your code the most. Just a heads-up.
@brainlid Sure, I should be able to run `main` against my setup and see if it breaks anything. Are there any breaking changes to the API I'd need to integrate?
The big change is around function calls and function results.
An assistant message can contain 0 to many ToolCalls. A Tool message contains 1 or more ToolResults.
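A minimal sketch of what those shapes look like (struct and constructor names are assumed from the `LangChain.Message`, `ToolCall`, and `ToolResult` structs that appear later in this thread; check the CHANGELOG for the exact API):

```elixir
alias LangChain.Message
alias LangChain.Message.ToolCall
alias LangChain.Message.ToolResult

# An assistant message may carry zero or more tool calls.
assistant =
  Message.new_assistant!(%{
    tool_calls: [
      ToolCall.new!(%{
        call_id: "call_123",
        name: "calculator",
        arguments: %{"expression" => "100 + 300 - 200"}
      })
    ]
  })

# A :tool message carries one or more tool results,
# each linked back to the call that produced it.
tool_message =
  Message.new_tool_result!(%{
    tool_results: [
      ToolResult.new!(%{tool_call_id: "call_123", content: "200"})
    ]
  })
```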
Looking forward to this as Gemini Pro APIs have become publicly available. This will also allow me to use this project as an abstraction for LLMs in my Elixir-port of Autogen which is a multi-agent framework from Microsoft.
Here are the Function Calling docs for Gemini models: https://ai.google.dev/gemini-api/docs/function-calling
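For context, a tool exposed to Gemini is declared the same way as for other backends. This is a hedged sketch based on the Calculator `Function` struct dumped in the test output later in this thread (the return-value shape of the handler is an assumption and may differ by library version):

```elixir
alias LangChain.Function

# Field names mirror the Calculator tool struct shown in this thread.
calculator =
  Function.new!(%{
    name: "calculator",
    description: "Perform basic math calculations or expressions",
    parameters_schema: %{
      type: "object",
      required: ["expression"],
      properties: %{
        expression: %{type: "string", description: "A simple mathematical expression"}
      }
    },
    function: fn %{"expression" => _expr}, _context ->
      # Evaluate the expression here; the real Calculator tool does the parsing.
      "200"
    end
  })
```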
@nileshtrivedi if you are interested, I would appreciate some help testing out the above change. I tried it out in my project but ran into some errors with the new code and couldn't get it working. I haven't had time to go back and fix the issues, however.
@jadengis I thought of modifying the existing `tools/calculator_test` for Google Gemini models, but even the existing test for OpenAI seems to fail for me at this line because `message.content` seems to be `nil`. This is the message object when this assertion fails:
```elixir
%LangChain.Message{content: nil, processed_content: nil, index: 0, status: :complete, role: :assistant, name: nil, tool_calls: [%LangChain.Message.ToolCall{status: :complete, type: :function, call_id: "call_9Fq4ZN3U4Ln4D52Kwg1DXj0G", name: "calculator", arguments: %{"expression" => "100 + 300 - 200"}, index: nil}], tool_results: nil}
```
I don't know whether it's a bug in the test code itself or something else. Unable to test further. 🫤
For my own work, even Autogen itself currently seems broken for Gemini models. I ended up using Python example code using Google's own SDK.
@nileshtrivedi I updated and fixed the Calculator tool and tests. Thanks for pointing that out!
https://github.com/brainlid/langchain/pull/132
@jadengis please let me know what errors you're getting! We can hopefully get it ironed out quickly.
For migrating, I tried to document what would be needed in the CHANGELOG. Let me know if you're finding gaps!
I submitted #135 as a failing test. While `mix test test/chat_models/chat_google_ai_test.exs --include live_call` passes, `mix test test/tools/calculator_gemini_test.exs --include live_call` fails with "Unexpected response" from the LLM.
I think there are multiple errors in calling the Gemini model APIs properly:

- Model name: `"gemini-pro"` is no longer valid. Valid model names (e.g. `"gemini-1.5-pro"` / `"gemini-1.5-flash"`) are listed here.
- The version is included twice in the URL. The request is being made to: `https://generativelanguage.googleapis.com/v1beta/v1beta/models/gemini-1.5-flash:generateContent`. Notice `v1beta` appearing twice in the URL. This is because the private method `build_url` appends `version` to `endpoint`, which already includes the version.
- The `Utils.fire_callback` method has not been defined.

There may be more issues.
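One way the duplicated version segment could be avoided (a hypothetical sketch, not the library's actual code: `build_url` and the field names are taken from the error above, but the exact signature is an assumption):

```elixir
# Keep the version out of the default endpoint and join the URL segments once.
@default_endpoint "https://generativelanguage.googleapis.com"
@default_api_version "v1beta"

defp build_url(%{endpoint: endpoint, api_version: version, model: model}, action) do
  # e.g. https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent
  "#{endpoint}/#{version}/models/#{model}:#{action}"
end
```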
EDITED: Noticed this open PR for fixing the endpoint: https://github.com/brainlid/langchain/pull/118. It seems there are subtle differences between the Gemini API and the VertexAI API which are causing these.
@nileshtrivedi We split out ChatVertexAI from ChatGoogleAI because the differences were subtle but pervasive. In the published RC, the callbacks have been updated as well. Thank you for looking at it before. How does it look now?
The Gemini API still seems to fail in my testing. I tested after making this change in `test/tools/calculator_test.exs`:
```diff
--- a/test/tools/calculator_test.exs
+++ b/test/tools/calculator_test.exs
@@ -5,7 +5,7 @@ defmodule LangChain.Tools.CalculatorTest do
   doctest LangChain.Tools.Calculator
   alias LangChain.Tools.Calculator
   alias LangChain.Function
-  alias LangChain.ChatModels.ChatOpenAI
+  alias LangChain.ChatModels.ChatGoogleAI
   alias LangChain.Message.ToolCall
   alias LangChain.Message.ToolResult
@@ -80,7 +80,7 @@ defmodule LangChain.Tools.CalculatorTest do
       end
     }
-    model = ChatOpenAI.new!(%{seed: 0, temperature: 0, stream: false, callbacks: [llm_handler]})
+    model = ChatGoogleAI.new!(%{model: "gemini-1.5-pro", api_key: System.fetch_env!("GEMINI_API_KEY"), seed: 0, temperature: 0, stream: false, callbacks: [llm_handler]})
```
This is the test failure I get:
```
% mix test test/tools/calculator_test.exs --include live_call
Compiling 1 file (.ex)
Including tags: [:live_call]
.

  1) test live test performs repeated calls until complete with a live LLM (LangChain.Tools.CalculatorTest)
     test/tools/calculator_test.exs:68
     ** (MatchError) no match of right hand side value: {:error, %LangChain.Chains.LLMChain{llm: %LangChain.ChatModels.ChatGoogleAI{endpoint: "https://generativelanguage.googleapis.com/v1beta", api_version: "v1beta", model: "gemini-1.5-pro", api_key: "REMOVED_FOR_SAFETY", temperature: 0.0, top_p: 1.0, top_k: 1.0, receive_timeout: 60000, stream: false, callbacks: [%{on_llm_new_message: #Function<1.19352243/2 in LangChain.Tools.CalculatorTest."test live test performs repeated calls until complete with a live LLM"/1>}]}, verbose: false, verbose_deltas: false, tools: [%LangChain.Function{name: "calculator", description: "Perform basic math calculations or expressions", display_text: nil, function: #Function<0.76765322/2 in LangChain.Tools.Calculator.execute>, async: true, parameters_schema: %{type: "object", required: ["expression"], properties: %{expression: %{type: "string", description: "A simple mathematical expression"}}}, parameters: []}], _tool_map: %{"calculator" => %LangChain.Function{name: "calculator", description: "Perform basic math calculations or expressions", display_text: nil, function: #Function<0.76765322/2 in LangChain.Tools.Calculator.execute>, async: true, parameters_schema: %{type: "object", required: ["expression"], properties: %{expression: %{type: "string", description: "A simple mathematical expression"}}}, parameters: []}}, messages: [%LangChain.Message{content: "Answer the following math question: What is 100 + 300 - 200?", processed_content: nil, index: nil, status: :complete, role: :user, name: nil, tool_calls: [], tool_results: nil}], custom_context: nil, message_processors: [], max_retry_count: 3, current_failure_count: 0, delta: nil, last_message: %LangChain.Message{content: "Answer the following math question: What is 100 + 300 - 200?", processed_content: nil, index: nil, status: :complete, role: :user, name: nil, tool_calls: [], tool_results: nil}, needs_response: true, callbacks: [%{on_tool_response_created: #Function<2.19352243/2 in LangChain.Tools.CalculatorTest."test live test performs repeated calls until complete with a live LLM"/1>}]}, "Unexpected response"}
     code: {:ok, updated_chain, %Message{} = message} =
     stacktrace:
       test/tools/calculator_test.exs:85: (test)

The following output was logged:

14:01:18.910 [error] Trying to process an unexpected response. ""
14:01:18.910 [error] Error during chat call. Reason: "Unexpected response"
......
Finished in 1.8 seconds (0.00s async, 1.8s sync)
8 tests, 1 failure

Randomized with seed 98474
```
I confirmed that my actual api_key was printed where it says `REMOVED_FOR_SAFETY`. I also tried `model: "gemini-1.5-flash"` but got the same error.
It might be easier if you sign up at https://ai.google.dev/ to get an API key to help with testing.
Hello @nileshtrivedi,
If you replace these lines: https://github.com/brainlid/langchain/blob/d63e11a3e926450bcb2e983ae64c135ccb52c822/lib/chat_models/chat_google_ai.ex#L27-L29

with:

```elixir
@default_endpoint "https://generativelanguage.googleapis.com"
@default_api_version "v1beta"
```

do the tests pass?
@ljgago No, it fails but with a different error:
```
% mix test test/tools/calculator_test.exs --include live_call
Compiling 39 files (.ex)
Generated langchain app
Including tags: [:live_call]
....

  1) test live test performs repeated calls until complete with a live LLM (LangChain.Tools.CalculatorTest)
     test/tools/calculator_test.exs:68
     ** (LangChain.LangChainError) content: is invalid for role tool
     code: |> LLMChain.run(mode: :while_needs_response)
     stacktrace:
       (langchain 0.3.0-rc.0) lib/message.ex:408: LangChain.Message.new_tool_result!/1
       (langchain 0.3.0-rc.0) lib/chains/llm_chain.ex:692: LangChain.Chains.LLMChain.execute_tool_calls/2
       (langchain 0.3.0-rc.0) lib/chains/llm_chain.ex:321: LangChain.Chains.LLMChain.run_while_needs_response/1
       test/tools/calculator_test.exs:95: (test)

The following output was logged:

06:58:59.287 [debug] Executing function "calculator"
...
Finished in 2.6 seconds (0.00s async, 2.6s sync)
8 tests, 1 failure

Randomized with seed 604620
```
I am happy to get on a call with any devs to work this out.
@nileshtrivedi I set up a GoogleAI account and got an API key. There are multiple issues with the ChatGoogleAI implementation.
I've fixed a couple locally (not pushed yet), but there's an issue with the ToolResult handling that I'm still trying to figure out.
A big issue is that Google's API docs are really poor. Their API also does things I haven't seen any other API do (in other words, odd behaviors). All in all, I don't like Google's service! :grimacing:
Still, I acknowledge your issue, it is valid, and I hope to have a resolution sometime soon. Thanks!
@nileshtrivedi This is hopefully fixed now! :crossed_fingers:
Just merged PR #152 to `main`. If you test using `main`, it should be working now. Please let me know!
Thanks @brainlid! This definitely seems to have improved, as ChatGoogleAI responses are now being collected. To the user's message `Answer the following math question: What is 100 + 300 - 200?`, ChatGoogleAI returns `The answer is 200.`.
However, there seem to be differences between ChatOpenAI and ChatGoogleAI return values. Specifically, I get a test failure on this line:
```
  1) test live test performs repeated calls until complete with a live LLM (LangChain.Tools.CalculatorTest)
     test/tools/calculator_test.exs:68
     ** (FunctionClauseError) no function clause matching in Kernel.=~/2

     The following arguments were given to Kernel.=~/2:

         # 1
         [%LangChain.Message.ContentPart{type: :text, content: "The answer is 200.", options: nil}]

         # 2
         "100 + 300 - 200"

     Attempted function clauses (showing 3 out of 3):

         def =~(left, "") when is_binary(left)
         def =~(left, right) when is_binary(left) and is_binary(right)
         def =~(left, right) when is_binary(left)

     code: assert message.content =~ "100 + 300 - 200"
```
With ChatGoogleAI, instead of `message.content` being a string, it is actually `[%LangChain.Message.ContentPart{type: :text, content: "The answer is 200.", options: nil}]`.

Also, Gemini's response is simply `The answer is 200.`, so this line would fail anyway. And `message.tool_calls` seems to be blank; I think it may require setting the function calling mode to `ANY` as per the Gemini docs.
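For reference, the Gemini REST API expresses that mode via `tool_config` in the request body (the keys below are from Google's function-calling docs; whether and where ChatGoogleAI exposes this option is an assumption):

```json
{
  "tool_config": {
    "function_calling_config": {
      "mode": "ANY",
      "allowed_function_names": ["calculator"]
    }
  }
}
```

With `"mode": "ANY"`, the model is constrained to always respond with a call to one of the allowed functions instead of free-form text.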
My bad, I didn't notice the new tests added (including the one for tool calling). `mix test test/chat_models/chat_google_ai_test.exs --include live_google_ai` is passing all the tests.
Kudos @brainlid for following up and fixing this! 👏
@nileshtrivedi I'm pretty annoyed by GoogleAI, honestly. It's such an oddball compared to the others. The API docs are sparse and difficult to use too. Ugh.
One odd thing that you noticed is that the assistant returns the contents in parts every time. That's just a decision they made. We could pattern match on that and, if there is only a single text part, flatten it to be `content: "the text"`.

I haven't used it enough to see it return anything else, though. Have you? If it makes sense, then doing that would make it easier to swap out the backend AI without impacting an application.
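That flattening could be a small pattern match (a sketch only; the `ContentPart` shape is taken from the failure output above, and the function name is hypothetical):

```elixir
alias LangChain.Message.ContentPart

# Collapse a single text part to a plain string; leave anything else untouched.
defp flatten_content([%ContentPart{type: :text, content: text}]), do: text
defp flatten_content(content), do: content
```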
I updated `LangChain.Utils.ChainResult.to_string` to match on that and make it easier to get the answer out.