
feat: add support for OpenAI-compatible third-party provider

Open ChinaSiro opened this issue 1 month ago • 15 comments

Fixes #341

Key requirements:

  • Accept OpenAI Chat/Completions request schema (model/messages/stream/etc.)
  • Ensure compatibility with third-party providers using the OpenAI format (e.g., self-hosted gateways, multi-model routers)
  • Support text, image, audio, video, and file inputs

ChinaSiro avatar Nov 22 '25 22:11 ChinaSiro

Summary of Changes

Hello @ChinaSiro, I'm Gemini Code Assist[^1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new openai package that provides an OpenAI-compatible model adapter. This adapter allows the system to seamlessly integrate with a wide range of third-party LLM providers that adhere to the OpenAI Chat/Completions API specification. The implementation includes robust request and response conversion, enabling support for advanced features like streaming, function calling, and multi-modal inputs, significantly expanding the system's flexibility and interoperability with the broader AI ecosystem.

Highlights

  • OpenAI Compatibility: Introduces a new openai package that acts as an adapter for OpenAI-compatible APIs, allowing the model.LLM interface to interact with various third-party LLM providers.
  • Comprehensive Feature Support: The adapter supports the OpenAI Chat/Completions request schema, including system instructions, streaming responses, and function calling capabilities.
  • Multi-modal Input Handling: Extends capabilities to process diverse input types such as text, images, audio, video, and PDF files by converting genai.Blob data into appropriate OpenAI content formats.
  • Detailed Example: A new main.go example is added to showcase the usage of the OpenAI-compatible model, demonstrating all supported features and multi-modal inputs.
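
To illustrate the multi-modal conversion mentioned in the highlights, here is a minimal sketch of turning an image blob into a Chat Completions `image_url` content part via a base64 data URL. The `blob`, `contentPart`, and `blobToImagePart` names are hypothetical stand-ins for illustration (only the `MIMEType` and `Data` fields mirror genai.Blob); the PR's actual conversion code is not shown in this thread and may differ.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// blob is a hypothetical stand-in carrying the two genai.Blob fields used here.
type blob struct {
	MIMEType string
	Data     []byte
}

// imageURL and contentPart model a simplified Chat Completions content part.
type imageURL struct {
	URL string `json:"url"`
}

type contentPart struct {
	Type     string    `json:"type"`
	Text     string    `json:"text,omitempty"`
	ImageURL *imageURL `json:"image_url,omitempty"`
}

// blobToImagePart encodes an image blob as a base64 data URL content part.
// Non-image MIME types are rejected; this sketch does not cover audio,
// video, or file inputs.
func blobToImagePart(b blob) (contentPart, error) {
	if !strings.HasPrefix(b.MIMEType, "image/") {
		return contentPart{}, fmt.Errorf("unsupported MIME type %q", b.MIMEType)
	}
	url := "data:" + b.MIMEType + ";base64," + base64.StdEncoding.EncodeToString(b.Data)
	return contentPart{Type: "image_url", ImageURL: &imageURL{URL: url}}, nil
}

func main() {
	part, err := blobToImagePart(blob{MIMEType: "image/png", Data: []byte("abc")})
	if err != nil {
		panic(err)
	}
	fmt.Println(part.Type, part.ImageURL.URL)
}
```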

[^1]: Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist[bot] avatar Nov 22 '25 22:11 gemini-code-assist[bot]

This update introduces two additional environment variables:

  • OPENAI_API_KEY – the API key for the OpenAI-compatible endpoint
  • OPENAI_BASE_URL – should include the full base path, e.g. https://api.example.com/v1

The API design intentionally keeps the same style as the existing Gemini adapter. For example:

model, err := gemini.NewModel(ctx, "gemini-2.5-flash", &genai.ClientConfig{
    APIKey: os.Getenv("GOOGLE_API_KEY"),
})

When switching to the OpenAI provider, developers only need to replace:

  • gemini.NewModel → openai.NewModel
  • genai.ClientConfig → openai.ClientConfig
  • GOOGLE_API_KEY → OPENAI_API_KEY

model, err := openai.NewModel(ctx, "any-model", &openai.ClientConfig{
    APIKey: os.Getenv("OPENAI_API_KEY"),
})

This keeps the usage consistent with the Gemini adapter and minimizes migration cost.
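
As a rough sketch of how a caller might load both environment variables before constructing the model: the `clientConfig` type and `configFromEnv` helper below are hypothetical and only stand in for openai.ClientConfig, whose exact fields are not shown in this comment. The sketch assumes both values are required, which matches the later note in this thread that no default base URL is provided.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// clientConfig is a hypothetical stand-in for openai.ClientConfig.
type clientConfig struct {
	APIKey  string
	BaseURL string
}

// configFromEnv reads OPENAI_API_KEY and OPENAI_BASE_URL through getenv and
// fails fast when either is missing. Taking getenv as a parameter keeps the
// helper easy to test without touching the process environment.
func configFromEnv(getenv func(string) string) (*clientConfig, error) {
	cfg := &clientConfig{
		APIKey:  getenv("OPENAI_API_KEY"),
		BaseURL: getenv("OPENAI_BASE_URL"),
	}
	if cfg.APIKey == "" {
		return nil, errors.New("OPENAI_API_KEY is not set")
	}
	if cfg.BaseURL == "" {
		return nil, errors.New("OPENAI_BASE_URL is not set")
	}
	return cfg, nil
}

func main() {
	cfg, err := configFromEnv(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println("using endpoint", cfg.BaseURL)
}
```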

ChinaSiro avatar Nov 23 '25 08:11 ChinaSiro

I tested the function calling examples examples/tools/multipletools (with the search tool removed) and examples/tools/loadartifacts with OpenAI models (via both OpenRouter and OpenAI). They don't seem to work, but the agent tool does.

cpunion avatar Nov 26 '25 08:11 cpunion

> I tested the function calling examples examples/tools/multipletools (with the search tool removed) and examples/tools/loadartifacts with OpenAI models (via both OpenRouter and OpenAI). They don't seem to work, but the agent tool does.

Thanks for testing this! I’ve now added ParametersJsonSchema support for function calls, and the updated implementation has been tested successfully. The agent tools continue to work as expected.

> It would be better to make BaseURL optional.

Since this implementation is primarily designed to be OpenAI-compatible, using an explicit BaseURL configuration is more appropriate. For that reason, we don't provide a default fallback when the value is missing.

ChinaSiro avatar Nov 26 '25 12:11 ChinaSiro

> Thanks for testing this! I’ve now added ParametersJsonSchema support for function calls, and the updated implementation has been tested successfully. The agent tools continue to work as expected.

Thanks for your great work! I just tested the OpenRouter and OpenAI models, and it works!

> Since this implementation is primarily designed to be OpenAI-compatible, using an explicit BaseURL configuration is more appropriate. For that reason, we don't provide a default fallback when the value is missing.

I understand.

BTW, some of the latest models (e.g. gpt-5-codex) only support the Responses API. Do you plan to support both the Chat Completions and Responses APIs, perhaps switchable with a config field? There is another PR that uses the Responses API: https://github.com/google/adk-go/pull/242

cpunion avatar Nov 26 '25 13:11 cpunion

> BTW, some of the latest models (e.g. gpt-5-codex) only support the Responses API. Do you plan to support both the Chat Completions and Responses APIs, perhaps switchable with a config field? There is another PR that uses the Responses API: #242

Thanks for the suggestion! It’s true that some newer models (e.g., gpt-5-codex) only support the Responses API. But currently only OpenAI fully supports it, while most other providers — including local LLMs and third-party services — still rely on the traditional Chat Completions API.

Adding Responses logic directly into this file could introduce extra complexity and potentially affect compatibility with those providers. To keep things clean, I think it’s better handled in a separate follow-up.

My plan:

  • Keep openai.go unchanged
  • Add a UseResponsesAPI bool flag
  • Route to a new openai_responses.go when enabled
  • Ensure both paths stay clean and independent
  • After this PR is merged, I’ll open another PR to add full Responses API support.
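
The routing idea in the plan above could look roughly like the following sketch. The `clientConfig` type and `endpoint` helper are hypothetical; only the `UseResponsesAPI` flag name comes from the plan, and the real follow-up PR may structure this differently. The `/chat/completions` and `/responses` suffixes follow OpenAI's published API paths and assume the base URL already includes the version prefix (e.g. `/v1`).

```go
package main

import (
	"fmt"
	"strings"
)

// clientConfig is a hypothetical config carrying the proposed flag.
type clientConfig struct {
	BaseURL         string
	UseResponsesAPI bool
}

// endpoint picks the request URL based on the flag, so the Chat Completions
// path stays the default and the Responses path is strictly opt-in.
func endpoint(cfg clientConfig) string {
	base := strings.TrimRight(cfg.BaseURL, "/")
	if cfg.UseResponsesAPI {
		return base + "/responses"
	}
	return base + "/chat/completions"
}

func main() {
	cfg := clientConfig{BaseURL: "https://api.example.com/v1"}
	fmt.Println(endpoint(cfg))

	cfg.UseResponsesAPI = true
	fmt.Println(endpoint(cfg))
}
```

Keeping the switch at a single decision point like this is one way to keep both paths independent, as the plan intends.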

ChinaSiro avatar Nov 26 '25 15:11 ChinaSiro

@ChinaSiro I tested this with a demo, but it stops at function calling in streaming mode. It may need a richer demo and an integration test.

I'm attaching my demo code as an example (AI generated). You can run it with these options:

research.zip

➜  research git:(feat/openai-compatible-provider) ✗ go run . -help
Usage of research:
  -app string
    	App name for the session (default "research-demo")
  -model string
    	Model name with provider, for example: gemini:gemini-2.5-flash
  -prompt string
    	Prompt to send to the agent
  -session string
    	Existing session ID; leave empty to create one
  -stream
    	Enable streaming mode
  -user string
    	User ID to associate with the session (default "demo-user")

-model accepts gemini:GEMINI-MODEL-NAME, openrouter:OPENROUTER-MODEL-NAME, or openai:OPENAI-MODEL-NAME, and -stream enables streaming mode.
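
For readers without the attachment, a `provider:name` value like the ones accepted by -model could be parsed along these lines. This is a sketch only: `parseModelSpec` is a hypothetical helper, not code taken from research.zip, and the provider list is assumed from the three prefixes named above.

```go
package main

import (
	"fmt"
	"strings"
)

// parseModelSpec splits "provider:name" at the first colon, so model names
// that themselves contain slashes or colons (e.g. "openai/gpt-5-mini")
// stay intact in the name part.
func parseModelSpec(spec string) (string, string, error) {
	provider, name, ok := strings.Cut(spec, ":")
	if !ok || provider == "" || name == "" {
		return "", "", fmt.Errorf("invalid model spec %q, want provider:name", spec)
	}
	switch provider {
	case "gemini", "openai", "openrouter":
		return provider, name, nil
	default:
		return "", "", fmt.Errorf("unknown provider %q", provider)
	}
}

func main() {
	provider, name, err := parseModelSpec("openrouter:openai/gpt-5-mini")
	if err != nil {
		panic(err)
	}
	fmt.Println(provider, name)
}
```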

I just tested the following (requires OPENROUTER_API_KEY, OPENAI_API_KEY, TAVILY_API_KEY, and GEMINI_API_KEY or GOOGLE_API_KEY to be set):

# Works
$ go run . -prompt "search TESLA stock price, and get weather of New York" -model "openai:gpt-5-mini"

# Doesn't work
$ go run . -prompt "search TESLA stock price, and get weather of New York" -model "openai:gpt-5-mini" -stream

# Works
$ go run . -prompt "search TESLA stock price, and get weather of New York" -model "openrouter:openai/gpt-5-mini"

# Doesn't work
$ go run . -prompt "search TESLA stock price, and get weather of New York" -model "openrouter:openai/gpt-5-mini" -stream

# Works
$ go run . -prompt "search TESLA stock price, and get weather of New York" -model "gemini:gemini-2.5-flash"

# Works
$ go run . -prompt "search TESLA stock price, and get weather of New York" -model "gemini:gemini-2.5-flash" -stream

BTW: I can review and do some tests, but I have no write permissions.

cpunion avatar Nov 27 '25 03:11 cpunion

> @ChinaSiro I tested this with a demo, but it stops at function calling in streaming mode. It may need a richer demo and an integration test.

Your tests are correct — the current Go implementation in this PR is essentially a re-translation of the Python version. I compared it against the Python code yesterday, and the logic in this PR is already aligned with the ADK-Python implementation.

To fully support tool calling in streaming mode, additional changes across other layers would likely be required, not just within this adapter. However, based on the existing ADK-Go examples, everything seems to be working as expected under the current design.

(Reference: lite_llm.py in ADK-Python) https://github.com/google/adk-python/blob/main/src/google/adk/models/lite_llm.py

ChinaSiro avatar Nov 29 '25 06:11 ChinaSiro

I just used Codex to compare openai.go and lite_llm.py to try to find the cause. It updated some code, and it now works on my machine. Can you review the patch below?

diff --git a/model/openai/openai.go b/model/openai/openai.go
index 64e4fdb..1afb4de 100644
--- a/model/openai/openai.go
+++ b/model/openai/openai.go
@@ -146,6 +146,7 @@ type openAIMessage struct {
 
 type openAIToolCall struct {
 	ID       string             `json:"id"`
+	Index    *int               `json:"index,omitempty"`
 	Type     string             `json:"type"` // "function"
 	Function openAIFunctionCall `json:"function"`
 }
@@ -585,20 +586,24 @@ func (m *openAIModel) generateStream(ctx context.Context, openaiReq *openAIReque
 			// Handle tool calls
 			if len(delta.ToolCalls) > 0 {
 				for idx, tc := range delta.ToolCalls {
+					targetIdx := idx
+					if tc.Index != nil {
+						targetIdx = *tc.Index
+					}
 					// Ensure we have enough space in toolCalls slice
-					for len(toolCalls) <= idx {
+					for len(toolCalls) <= targetIdx {
 						toolCalls = append(toolCalls, openAIToolCall{})
 					}
 					if tc.ID != "" {
-						toolCalls[idx].ID = tc.ID
+						toolCalls[targetIdx].ID = tc.ID
 					}
 					if tc.Type != "" {
-						toolCalls[idx].Type = tc.Type
+						toolCalls[targetIdx].Type = tc.Type
 					}
 					if tc.Function.Name != "" {
-						toolCalls[idx].Function.Name = tc.Function.Name
+						toolCalls[targetIdx].Function.Name += tc.Function.Name
 					}
-					toolCalls[idx].Function.Arguments += tc.Function.Arguments
+					toolCalls[targetIdx].Function.Arguments += tc.Function.Arguments
 				}
 			}
 
@@ -720,6 +725,9 @@ func (m *openAIModel) convertResponse(resp *openAIResponse) (*model.LLMResponse,
 
 	// Handle tool calls
 	for _, tc := range toolCalls {
+		if tc.ID == "" && tc.Function.Name == "" && tc.Function.Arguments == "" {
+			continue
+		}
 		var args map[string]any
 		if err := json.Unmarshal([]byte(tc.Function.Arguments), &args); err != nil {
 			return nil, fmt.Errorf("failed to unmarshal tool arguments: %w", err)
@@ -753,6 +761,9 @@ func (m *openAIModel) buildFinalResponse(text string, toolCalls []openAIToolCall
 	}
 
 	for _, tc := range toolCalls {
+		if tc.ID == "" && tc.Function.Name == "" && tc.Function.Arguments == "" {
+			continue
+		}
 		var args map[string]any
 		if err := json.Unmarshal([]byte(tc.Function.Arguments), &args); err != nil {
 			continue
diff --git a/model/openai/openai_test.go b/model/openai/openai_test.go
index 22f37a6..74ec17e 100644
--- a/model/openai/openai_test.go
+++ b/model/openai/openai_test.go
@@ -249,6 +249,255 @@ func TestModel_Generate(t *testing.T) {
 	}
 }
 
+func TestModel_GenerateStream_WithMultipleToolCalls(t *testing.T) {
+	chunks := []map[string]any{
+		{
+			"id": "chatcmpl-test",
+			"choices": []any{
+				map[string]any{
+					"index": 0,
+					"delta": map[string]any{
+						"tool_calls": []any{
+							map[string]any{
+								"index": 0,
+								"id":    "call_a",
+								"type":  "function",
+								"function": map[string]any{
+									"name":      "tool_a",
+									"arguments": `{"task":"one"}`,
+								},
+							},
+						},
+					},
+				},
+			},
+		},
+		{
+			"id": "chatcmpl-test",
+			"choices": []any{
+				map[string]any{
+					"index": 0,
+					"delta": map[string]any{
+						"tool_calls": []any{
+							map[string]any{
+								"index": 1,
+								"id":    "call_b",
+								"type":  "function",
+								"function": map[string]any{
+									"name":      "tool_b",
+									"arguments": `{"task":"two"}`,
+								},
+							},
+						},
+					},
+				},
+			},
+		},
+		{
+			"id": "chatcmpl-test",
+			"choices": []any{
+				map[string]any{
+					"index":         0,
+					"delta":         map[string]any{},
+					"finish_reason": "tool_calls",
+				},
+			},
+		},
+	}
+
+	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		w.Header().Set("Content-Type", "text/event-stream")
+		flusher, ok := w.(http.Flusher)
+		if !ok {
+			t.Fatal("expected http.Flusher")
+		}
+		for _, chunk := range chunks {
+			data, err := json.Marshal(chunk)
+			if err != nil {
+				t.Fatalf("failed to marshal chunk: %v", err)
+			}
+			fmt.Fprintf(w, "data: %s\n\n", data)
+			flusher.Flush()
+		}
+		fmt.Fprint(w, "data: [DONE]\n\n")
+		flusher.Flush()
+	}))
+	defer server.Close()
+
+	llm := newTestModel(t, server)
+	req := &model.LLMRequest{Contents: genai.Text("hi")}
+	seq := llm.GenerateContent(context.Background(), req, true)
+	var finalResp *model.LLMResponse
+	for resp, err := range seq {
+		if err != nil {
+			t.Fatalf("GenerateContent returned error: %v", err)
+		}
+		if !resp.Partial {
+			finalResp = resp
+		}
+	}
+
+	if finalResp == nil {
+		t.Fatal("expected final response")
+	}
+	if got := len(finalResp.Content.Parts); got != 2 {
+		t.Fatalf("expected 2 function calls, got %d", got)
+	}
+	gotCalls := []string{
+		finalResp.Content.Parts[0].FunctionCall.Name,
+		finalResp.Content.Parts[1].FunctionCall.Name,
+	}
+	if diff := cmp.Diff([]string{"tool_a", "tool_b"}, gotCalls); diff != "" {
+		t.Errorf("unexpected tool call names (-want +got):\n%s", diff)
+	}
+	if finalResp.Content.Parts[0].FunctionCall.Args["task"] != "one" {
+		t.Errorf("unexpected args for first tool call: %+v", finalResp.Content.Parts[0].FunctionCall.Args)
+	}
+	if finalResp.Content.Parts[1].FunctionCall.Args["task"] != "two" {
+		t.Errorf("unexpected args for second tool call: %+v", finalResp.Content.Parts[1].FunctionCall.Args)
+	}
+}
+
+func TestModel_GenerateStream_WithSplitToolCallChunks(t *testing.T) {
+	chunks := []map[string]any{
+		{
+			"id": "chatcmpl-split",
+			"choices": []any{
+				map[string]any{
+					"index": 0,
+					"delta": map[string]any{
+						"tool_calls": []any{
+							map[string]any{
+								"index": 0,
+								"id":    "call_weather",
+								"type":  "function",
+								"function": map[string]any{
+									"name":      "get_",
+									"arguments": "",
+								},
+							},
+						},
+					},
+				},
+			},
+		},
+		{
+			"id": "chatcmpl-split",
+			"choices": []any{
+				map[string]any{
+					"index": 0,
+					"delta": map[string]any{
+						"tool_calls": []any{
+							map[string]any{
+								"index": 0,
+								"function": map[string]any{
+									"name":      "weather",
+									"arguments": "",
+								},
+							},
+						},
+					},
+				},
+			},
+		},
+		{
+			"id": "chatcmpl-split",
+			"choices": []any{
+				map[string]any{
+					"index": 0,
+					"delta": map[string]any{
+						"tool_calls": []any{
+							map[string]any{
+								"index": 0,
+								"function": map[string]any{
+									"arguments": `{"loc`,
+								},
+							},
+						},
+					},
+				},
+			},
+		},
+		{
+			"id": "chatcmpl-split",
+			"choices": []any{
+				map[string]any{
+					"index": 0,
+					"delta": map[string]any{
+						"tool_calls": []any{
+							map[string]any{
+								"index": 0,
+								"function": map[string]any{
+									"arguments": `ation":"Boston"}`,
+								},
+							},
+						},
+					},
+				},
+			},
+		},
+		{
+			"id": "chatcmpl-split",
+			"choices": []any{
+				map[string]any{
+					"index":         0,
+					"delta":         map[string]any{},
+					"finish_reason": "tool_calls",
+				},
+			},
+		},
+	}
+
+	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		w.Header().Set("Content-Type", "text/event-stream")
+		flusher, ok := w.(http.Flusher)
+		if !ok {
+			t.Fatal("expected http.Flusher")
+		}
+		for _, chunk := range chunks {
+			data, err := json.Marshal(chunk)
+			if err != nil {
+				t.Fatalf("failed to marshal chunk: %v", err)
+			}
+			fmt.Fprintf(w, "data: %s\n\n", data)
+			flusher.Flush()
+		}
+		fmt.Fprint(w, "data: [DONE]\n\n")
+		flusher.Flush()
+	}))
+	defer server.Close()
+
+	llm := newTestModel(t, server)
+	req := &model.LLMRequest{Contents: genai.Text("hello")}
+	seq := llm.GenerateContent(context.Background(), req, true)
+	var finalResp *model.LLMResponse
+	for resp, err := range seq {
+		if err != nil {
+			t.Fatalf("GenerateContent returned error: %v", err)
+		}
+		if !resp.Partial {
+			finalResp = resp
+		}
+	}
+
+	if finalResp == nil {
+		t.Fatal("expected final response")
+	}
+	if got := len(finalResp.Content.Parts); got != 1 {
+		t.Fatalf("expected 1 function call, got %d", got)
+	}
+	part := finalResp.Content.Parts[0]
+	if part.FunctionCall == nil {
+		t.Fatalf("expected function call part, got %+v", part)
+	}
+	if part.FunctionCall.Name != "get_weather" {
+		t.Fatalf("unexpected function name: %s", part.FunctionCall.Name)
+	}
+	if part.FunctionCall.Args["location"] != "Boston" {
+		t.Fatalf("unexpected args: %+v", part.FunctionCall.Args)
+	}
+}
+
 func TestModel_GenerateStream(t *testing.T) {
 	tests := []struct {
 		name    string
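
The core of the patch above is that streamed tool-call fragments are merged by their `index` field rather than by their position within a chunk, and that name and argument fragments are concatenated. A standalone sketch of that merging logic follows; the `toolCallDelta`, `toolCall`, and `mergeDeltas` names are simplified, hypothetical types for illustration and are not the PR's actual structs.

```go
package main

import "fmt"

// toolCallDelta is a simplified streamed fragment of a tool call.
type toolCallDelta struct {
	Index     *int // position of the tool call this fragment belongs to
	ID        string
	Name      string
	Arguments string
}

// toolCall is the fully assembled call.
type toolCall struct {
	ID, Name, Arguments string
}

// mergeDeltas folds one chunk's fragments into the accumulated calls.
// The target slot comes from Index when present, falling back to the
// fragment's position within the chunk.
func mergeDeltas(calls []toolCall, deltas []toolCallDelta) []toolCall {
	for pos, d := range deltas {
		idx := pos
		if d.Index != nil {
			idx = *d.Index
		}
		if idx < 0 {
			continue // ignore malformed indices
		}
		for len(calls) <= idx {
			calls = append(calls, toolCall{})
		}
		if d.ID != "" {
			calls[idx].ID = d.ID
		}
		calls[idx].Name += d.Name           // names may arrive in pieces
		calls[idx].Arguments += d.Arguments // arguments arrive as JSON fragments
	}
	return calls
}

func intPtr(i int) *int { return &i }

func main() {
	chunks := [][]toolCallDelta{
		{{Index: intPtr(0), ID: "call_weather", Name: "get_"}},
		{{Index: intPtr(0), Name: "weather"}},
		{{Index: intPtr(0), Arguments: `{"loc`}},
		{{Index: intPtr(0), Arguments: `ation":"Boston"}`}},
	}
	var calls []toolCall
	for _, c := range chunks {
		calls = mergeDeltas(calls, c)
	}
	fmt.Printf("%+v\n", calls)
}
```

Without the index lookup, a second tool call that arrives alone in its own chunk would be merged into slot 0, which is consistent with the streaming failures reported earlier in the thread.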

cpunion avatar Nov 29 '25 12:11 cpunion

> I just used Codex to compare openai.go and lite_llm.py to try to find the cause. It updated some code, and it now works on my machine. Can you review the patch below?

@cpunion Damn, we actually made it! Huge thanks — you figured it out brilliantly, and I can’t believe I didn’t think of it earlier. This update also passes your previous tests, and I’ve re-verified both multipletools and loadartifacts on my side.

ChinaSiro avatar Nov 29 '25 16:11 ChinaSiro

I can run the examples successfully, but coverage is still pretty low—GenerateContent/generateStream hover around 55–75% and the helper functions for tool calling (e.g., extractTexts, parseToolCallsFromText) are barely covered. Because the streaming parser and tool-call assembly are custom instead of relying on the official OpenAI SDK, it would be great to add targeted tests for those code paths.

cpunion avatar Dec 01 '25 02:12 cpunion

> I can run the examples successfully, but coverage is still pretty low: GenerateContent/generateStream hover around 55–75% and the helper functions for tool calling (e.g., extractTexts, parseToolCallsFromText) are barely covered. Because the streaming parser and tool-call assembly are custom instead of relying on the official OpenAI SDK, it would be great to add targeted tests for those code paths.

Thanks! I've already fixed the issues and updated the tests accordingly.

ChinaSiro avatar Dec 01 '25 12:12 ChinaSiro

@dpasiukevich Please review the current PR again so I can see what needs to be added or adjusted.

ChinaSiro avatar Dec 02 '25 17:12 ChinaSiro