# Support for Google Vertex AI

### Validations
- [X] I believe this is a way to improve. I'll try to join the Continue Discord for questions
- [X] I'm not able to find an open issue that requests the same enhancement
### Problem

Right now the "Google Gemini API" is supported, but "Google Vertex AI" is not. This is unfortunate, as Google Vertex AI covers more models than just Gemini (e.g., Claude 3.5), and it is securely usable by enterprises via their existing gcloud authentication and existing enterprise contracts with Google.

### Solution
Add support for connecting to Google Vertex and selecting models from it. Most likely authenticating using Application Default Credentials would be the most reasonable way to do this. See https://cloud.google.com/vertex-ai/docs/authentication
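As a sketch of what the Application Default Credentials flow looks like from the user's side (the project ID below is a placeholder, and enabling the API may already be done in an enterprise setup):

```shell
# Create Application Default Credentials locally; client libraries that
# follow the ADC lookup order will pick these up automatically.
gcloud auth application-default login

# Ensure the Vertex AI API is enabled for the target project
# ("my-gcp-project" is a placeholder).
gcloud services enable aiplatform.googleapis.com --project=my-gcp-project
```

In service-account environments, setting `GOOGLE_APPLICATION_CREDENTIALS` to a key file path serves the same purpose.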
Looking at the docs and at our implementation of the Gemini LLM provider, we might be very close to supporting this already. Is this urgent enough that you might consider a small PR? I agree that this is a great provider to support, and regardless, we can make it happen.
Any updates on Vertex AI support? Really appreciate your work on this!
Hey guys, I'm here to support this. However, keep in mind that it's more than an authentication layer. Anthropic will, at some point, deprecate the Text Completions API in favor of the new and much improved Messages API. I'm sure there will be some on-ramping, but Google and Amazon chose not to support it at all, as it's not going to be around for much longer.
I don't know how this affects Continue, but the Vertex leg of the Anthropic SDK already has it in place. GCP is cheaper than Anthropic with no quota ceiling. However, virtually none of us can use it for anything non-organic, and it's frustrating. Infuriating, actually :) ☠️
If Continue were the only codegen product on the market that supported it, you'd likely pick up a million subscribers on the spot. Not exaggerating.
Hi, I need this too! For all professional users in Europe who have big accounting troubles with invoices coming from the USA, being able to use models like Anthropic's via GCP and receive invoices locally is a game changer!
I agree with @sid-newby that implementing this would bring you A LOT more users.
Hey! Would love this! Same thoughts as @IngLP! Any updates regarding this?
Is there an ongoing branch? Would love to help here.
There is a free $300 credit for 3 months on Vertex AI, and Sonnet 3.5 is also included, so it's a big miss :(
You can use LiteLLM as a proxy for Vertex AI. Let me know if you need help.
@ClaudiuBogdan I need help with LiteLLM + Vertex AI.
@pedro-wbd You can find the instructions here: https://claudiuconstantinbogdan.me/articles/litellm-vertexai-continuedev
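For anyone skimming, the LiteLLM proxy route boils down to a config like the following sketch (project, region, and the served model name are placeholders; check the linked article for the full setup):

```yaml
# LiteLLM config.yaml -- proxy a Vertex AI model to an OpenAI-compatible endpoint
model_list:
  - model_name: claude-3-5-sonnet
    litellm_params:
      model: vertex_ai/claude-3-5-sonnet@20240620
      vertex_project: my-gcp-project   # placeholder
      vertex_location: europe-west1    # placeholder
```

Then run `litellm --config config.yaml --port 4000` and point Continue at `http://localhost:4000` as an OpenAI-compatible provider.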
I got this error with Sonnet; Gemini models work fine:
```json
{
  "healthy_endpoints": [],
  "unhealthy_endpoints": [
    {
      "vertex_project": "precise-crowbar-437818-d2",
      "vertex_location": "europe-west1",
      "model": "vertex_ai/claude-3-5-sonnet@20240620",
      "cache": {
        "no-cache": true
      },
      "error": "litellm.RateLimitError: litellm.RateLimitError: VertexAIException - Client error '429 Too Many Requests' for url 'https://europe-west1-aiplatform.googleapis.com/v1/projects/precise-crowbar-437818-d2/locations/europe-west1/publishers/anthropic/models/claude-3-5-sonnet@20240620:rawPredict'\nFor more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/429\nHave you set 'mode' - https://docs.litellm.ai/docs/proxy/health#embedding-models\nstack trace: Traceback (most recent call last):\n  File \"/usr/local/lib/python3.11/site-packages/litellm/llms/anthropic/chat/handler.py\", line 384, in acompletion_function\n    response = await async_handler.post(\n               ^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/usr/local/lib/python3.11/site-packages/litellm/llms/custom_httpx/http_handler.py\", line 149, in post\n    raise e\n  File \"/usr/local/lib/python3.11/site-packages/litellm/llms/custom_httpx/http_handler.py\", line 113, in post\n    response.raise_for_status()\n  File \"/usr/local/lib/python3.11/site-packages/httpx/_models.py\", line 763, in raise_for_status\n    raise HTTPStatusError(message, request=request, response=self)\nhttpx.HTTPStatusError: Client error '429 Too Many Requests' for url 'https://europe-west1-aiplatform.googleapis.com/v1/projects/precise-crowbar-437818-d2/locations/europe-west1/publishers/anthropic/models/claude-3-5-sonnet@20240620:rawPredict'\nFor more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/429"
    }
  ],
  "healthy_count": 0,
  "unhealthy_count": 1
}
```
@ppsirius That's a Vertex AI error; I'm not sure why it happens. I got the same error after using the API for a few days. In my case, I just created a new project in Google Cloud, and that solved the problem.
@ClaudiuBogdan Did you happen to get text embeddings working as well through Vertex AI in Continue?
Edit:
Figured it out. For anyone else in the future:
In LiteLLM, create a new model:
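(The original screenshot is gone; a plausible LiteLLM model entry, inferred from the Continue config below, would look something like this, with project and region as placeholders:)

```yaml
# Hypothetical LiteLLM entry exposing a Vertex embedding model
model_list:
  - model_name: vertexai/text-embedding-004
    litellm_params:
      model: vertex_ai/text-embedding-004
      vertex_project: my-gcp-project   # placeholder
      vertex_location: europe-west1    # placeholder
```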
Then in continue:
```json
"embeddingsProvider": {
  "model": "vertexai/text-embedding-004",
  "provider": "openai",
  "apiBase": "http://localhost:4000",
  "apiKey": "apikey"
}
```
No free usage for sonnet :[ https://github.com/cg-dot/vertexai-cf-workers/issues/18
Closing this out with https://github.com/continuedev/continue/pull/2632
See the docs here: https://github.com/continuedev/continue/blob/dev/docs/docs/customize/model-providers/top-level/vertexai.md
Kudos to @Lash-L !
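Per the linked docs, a direct (no-proxy) Continue config should look roughly like the sketch below; the exact field names are as I recall them from the docs page, and `projectId`/`region` are placeholders, so double-check against the docs:

```json
{
  "models": [
    {
      "title": "Claude 3.5 Sonnet (Vertex AI)",
      "provider": "vertexai",
      "model": "claude-3-5-sonnet@20240620",
      "projectId": "my-gcp-project",
      "region": "us-east5"
    }
  ]
}
```

Authentication goes through Application Default Credentials, so `gcloud auth application-default login` must have been run first.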
@Patrick-Erichsen Slightly off-topic, but the latest JetBrains build is v0.0.82-jetbrains and therefore isn't getting this improvement yet.
Is anything holding off the JetBrains builds?
@Patrick-Erichsen I found a few issues with this:
- The docs URL is broken, the file is here now: https://github.com/continuedev/continue/blob/main/docs/docs/customize/model-providers/top-level/vertexai.md
- The model name in there isn't working; instead of `claude-3-5-sonnet-20240620` it has to be `claude-3-5-sonnet@20240620`
- There's something going on with the template; all I could get out of it is a response along the lines of:

> Understood. I will follow these guidelines for generating code responses and using tools. I'll make sure to adhere to the specified format for code blocks, use lazy comments appropriately, provide context around lazy comments, and include filenames when present. I'll also be mindful of when to use tools and avoid unnecessary tool calls. Is there anything specific you'd like me to do or any questions you have?
The code rewrite (Cmd+I) prompt is even more broken; it just throws an error.
Version: v0.9.239 (pre-release)
Works perfectly with local model using LM Studio.
Got this error from Vertex AI Studio using Claude 3.5 Sonnet v2. Can anyone help? `ERROR Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: anthropic-claude-3-5-sonnet-v2. Please submit a quota increase request.` https://cloud.google.com/vertex-ai/docs/generative-ai/quotas-genai
I tried submitting a quota increase request, but it did not help.
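One way to see what your current limit actually is before filing the request (a sketch; this is an alpha `gcloud` command whose flags may change, and the project ID is a placeholder):

```shell
# Inspect current quota values for the Vertex AI service.
# "my-gcp-project" is a placeholder; --filter narrows to the metric
# named in the error message.
gcloud alpha services quota list \
  --service=aiplatform.googleapis.com \
  --consumer=projects/my-gcp-project \
  --filter="online_prediction_requests_per_base_model"
```

Note that some partner models (including Anthropic's) have a default quota of 0 in many regions, in which case only an approved quota increase in a supported region will help.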