# Support for Google Vertex AI

### Validations
- [X] I believe this is a way to improve. I'll try to join the Continue Discord for questions
- [X] I'm not able to find an open issue that requests the same enhancement
### Problem

Right now the "Google Gemini API" is supported, but "Google Vertex AI" is not. This is unfortunate, as Google Vertex AI covers more models than just Gemini (e.g., Claude 3.5), and it is securely usable by enterprises via their existing gcloud authentication and existing enterprise contracts with Google.

### Solution
Add support for connecting to Google Vertex and selecting models from it. Most likely authenticating using Application Default Credentials would be the most reasonable way to do this. See https://cloud.google.com/vertex-ai/docs/authentication
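As a sketch of what the Application Default Credentials flow looks like from the user's side (the project ID below is a placeholder, and enabling the API may already be done in an enterprise setup):

```shell
# Create Application Default Credentials locally; client libraries that
# follow the ADC lookup order will pick these up automatically.
gcloud auth application-default login

# Ensure the Vertex AI API is enabled for the target project
# ("my-gcp-project" is a placeholder).
gcloud services enable aiplatform.googleapis.com --project=my-gcp-project
```

In service-account environments, setting `GOOGLE_APPLICATION_CREDENTIALS` to a key file path serves the same purpose.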
Looking at the docs and at our implementation of the Gemini LLM provider, we might be very close to supporting this already. Is this urgent enough that you might consider a small PR? I agree that this is a great provider to support, and regardless, we can make it happen.
Any updates on Vertex AI support? Really appreciate your work on this!
Hey guys, I'm here to support this. However, keep in mind that it's more than an authentication layer. Anthropic will, at some point, deprecate the Text Completions API in favor of the new and much improved Messages API. I'm sure there will be some on-ramping, but Google and Amazon chose not to support it at all, as it's not going to be around for much longer.
I don't know how this affects Continue, but the Vertex leg of the Anthropic SDK already has it in place. GCP is cheaper than Anthropic with no quota ceiling. However, virtually none of us can use it for anything non-organic, and it's frustrating. Infuriating, actually :) ☠️
If Continue were the only codegen product on the market that supported it, you'd likely pick up a million subscribers on the spot. Not exaggerating.
Hi, I need this too! For all professional users in Europe who have big accounting troubles with invoices coming from the USA, being able to use models like Anthropic's via GCP and receive invoices locally is a game changer!
I agree with @sid-newby that implementing this would bring you A LOT more users.
Hey! Would love this! Same thoughts as @IngLP! Any updates regarding this?
Is there an ongoing branch? Would love to help here.
There is a free $300 credit for 3 months on Vertex AI, and Sonnet 3.5 is also included, so it's a big miss :(
You can use LiteLLM as a proxy for Vertex AI. Let me know if you need help.
@ClaudiuBogdan I need help with LiteLLM + Vertex AI.
@pedro-wbd You can find the instructions here: https://claudiuconstantinbogdan.me/articles/litellm-vertexai-continuedev
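For anyone skimming, the LiteLLM proxy route boils down to a config like the following sketch (project, region, and the served model name are placeholders; check the linked article for the full setup):

```yaml
# LiteLLM config.yaml -- proxy a Vertex AI model to an OpenAI-compatible endpoint
model_list:
  - model_name: claude-3-5-sonnet
    litellm_params:
      model: vertex_ai/claude-3-5-sonnet@20240620
      vertex_project: my-gcp-project   # placeholder
      vertex_location: europe-west1    # placeholder
```

Then run `litellm --config config.yaml --port 4000` and point Continue at `http://localhost:4000` as an OpenAI-compatible provider.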
I got this error with Sonnet; Gemini models work fine:
```json
{
  "healthy_endpoints": [],
  "unhealthy_endpoints": [
    {
      "vertex_project": "precise-crowbar-437818-d2",
      "vertex_location": "europe-west1",
      "model": "vertex_ai/claude-3-5-sonnet@20240620",
      "cache": {
        "no-cache": true
      },
      "error": "litellm.RateLimitError: litellm.RateLimitError: VertexAIException - Client error '429 Too Many Requests' for url 'https://europe-west1-aiplatform.googleapis.com/v1/projects/precise-crowbar-437818-d2/locations/europe-west1/publishers/anthropic/models/claude-3-5-sonnet@20240620:rawPredict'\nFor more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/429\nHave you set 'mode' - https://docs.litellm.ai/docs/proxy/health#embedding-models\nstack trace: Traceback (most recent call last):\n  File \"/usr/local/lib/python3.11/site-packages/litellm/llms/anthropic/chat/handler.py\", line 384, in acompletion_function\n    response = await async_handler.post(\n               ^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/usr/local/lib/python3.11/site-packages/litellm/llms/custom_httpx/http_handler.py\", line 149, in post\n    raise e\n  File \"/usr/local/lib/python3.11/site-packages/litellm/llms/custom_httpx/http_handler.py\", line 113, in post\n    response.raise_for_status()\n  File \"/usr/local/lib/python3.11/site-packages/httpx/_models.py\", line 763, in raise_for_status\n    raise HTTPStatusError(message, request=request, response=self)\nhttpx.HTTPStatusError: Client error '429 Too Many Requests' for url 'https://europe-west1-aiplatform.googleapis.com/v1/projects/precise-crowbar-437818-d2/locations/europe-west1/publishers/anthropic/models/claude-3-5-sonnet@20240620:rawPredict'\nFor more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/429"
    }
  ],
  "healthy_count": 0,
  "unhealthy_count": 1
}
```
@ppsirius That's a Vertex AI error; I'm not sure why it happens. I got the same error after using the API for a few days. In my case, I just created a new project in Google Cloud, and that solved the problem.
@ClaudiuBogdan Did you happen to get text embeddings working as well through Vertex AI in Continue?
Edit:
Figured it out. For anyone else in the future:
In LiteLLM, create a new model:
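(The original screenshot is gone; a plausible LiteLLM model entry, inferred from the Continue config below, would look something like this, with project and region as placeholders:)

```yaml
# Hypothetical LiteLLM entry exposing a Vertex embedding model
model_list:
  - model_name: vertexai/text-embedding-004
    litellm_params:
      model: vertex_ai/text-embedding-004
      vertex_project: my-gcp-project   # placeholder
      vertex_location: europe-west1    # placeholder
```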
Then in continue:
```json
"embeddingsProvider": {
  "model": "vertexai/text-embedding-004",
  "provider": "openai",
  "apiBase": "http://localhost:4000",
  "apiKey": "apikey"
}
```
No free usage for sonnet :[ https://github.com/cg-dot/vertexai-cf-workers/issues/18
Closing this out with https://github.com/continuedev/continue/pull/2632
See the docs here: https://github.com/continuedev/continue/blob/dev/docs/docs/customize/model-providers/top-level/vertexai.md
Kudos to @Lash-L !
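Per the linked docs, a direct (no-proxy) Continue config should look roughly like the sketch below; the exact field names are as I recall them from the docs page, and `projectId`/`region` are placeholders, so double-check against the docs:

```json
{
  "models": [
    {
      "title": "Claude 3.5 Sonnet (Vertex AI)",
      "provider": "vertexai",
      "model": "claude-3-5-sonnet@20240620",
      "projectId": "my-gcp-project",
      "region": "us-east5"
    }
  ]
}
```

Authentication goes through Application Default Credentials, so `gcloud auth application-default login` must have been run first.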
@Patrick-Erichsen Slightly off-topic, but the latest JetBrains build is v0.0.82-jetbrains and therefore isn't getting this improvement yet.
Is anything holding off the JetBrains builds?
@Patrick-Erichsen I found a few issues with this:
- The docs URL is broken, the file is here now: https://github.com/continuedev/continue/blob/main/docs/docs/customize/model-providers/top-level/vertexai.md
- The model name in there isn't working; instead of `claude-3-5-sonnet-20240620` it has to be `claude-3-5-sonnet@20240620`
- There's something going on with the template; all I could get out of it is a response along the lines of:

> Understood. I will follow these guidelines for generating code responses and using tools. I'll make sure to adhere to the specified format for code blocks, use lazy comments appropriately, provide context around lazy comments, and include filenames when present. I'll also be mindful of when to use tools and avoid unnecessary tool calls. Is there anything specific you'd like me to do or any questions you have?
The code rewrite (Cmd+I) prompt is even more broken; it just throws an error.
Version: v0.9.239 (pre-release)
Works perfectly with local model using LM Studio.
Got this error from Vertex AI Studio using Claude 3.5 Sonnet v2. Can anyone help? `ERROR Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: anthropic-claude-3-5-sonnet-v2. Please submit a quota increase request.` https://cloud.google.com/vertex-ai/docs/generative-ai/quotas-genai
I tried submitting a quota increase request, but it did not help.
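One way to see what your current limit actually is before filing the request (a sketch; this is an alpha `gcloud` command whose flags may change, and the project ID is a placeholder):

```shell
# Inspect current quota values for the Vertex AI service.
# "my-gcp-project" is a placeholder; --filter narrows to the metric
# named in the error message.
gcloud alpha services quota list \
  --service=aiplatform.googleapis.com \
  --consumer=projects/my-gcp-project \
  --filter="online_prediction_requests_per_base_model"
```

Note that some partner models (including Anthropic's) have a default quota of 0 in many regions, in which case only an approved quota increase in a supported region will help.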