[Feature]: Add Support for Google Gemini API as an LLM Option

Open saadhxak opened this issue 8 months ago • 3 comments

Which desktop app does this feature request relate to?

UI-TARS Desktop

What problem does this feature solve?

Support Gemini API

What does the proposed feature look like?

Select Gemini in AI Model Provider.

BTW, maybe put Settings here: [Image]

saadhxak avatar Mar 29 '25 15:03 saadhxak

Thank you for your feedback! Just to clarify, we already have GeminiProvider implemented (apps/agent-tars/src/main/llmProvider/providers/GeminiProvider.ts). However, it is currently not displayed in the user interface. Because of this, the effort required to complete the implementation should be relatively low.

If you're interested, contributions to improve or enhance this feature are always welcome! Feel free to share your thoughts or submit a pull request.

Ref: Implement DeepSeek model provider: https://github.com/bytedance/UI-TARS-desktop/pull/350


Note that even if the Gemini provider is supported, stable operation is not guaranteed; see: https://agent-tars.com/doc/quick-start#compare-model-providers

ulivz avatar Mar 29 '25 16:03 ulivz

I've made the changes. Happy to submit a PR.

jibzus avatar Apr 21 '25 08:04 jibzus

Hi, is there support for the Gemini API in UI-TARS (not Agent TARS)?

I am planning to test computer use with the Gemini API (2.5 Flash Preview) through UI-TARS on my desktop, and was wondering whether that's already possible.

Thank you for the awesome project.

jerlyjelly avatar Apr 25 '25 12:04 jerlyjelly