Support Custom Endpoint from Vertex AI Model Garden
I am trying to build a sample app that uses a custom Gemma3 model deployed from Vertex AI Model Garden. The SDK does not seem to provide a way to define custom endpoints like: https://{endpointId}.{location}-{projectId}.prediction.vertexai.goog/v1/projects/{projectId}/locations/{location}/models/endpoints/{endpointId}:generateContent
When I try to use the endpoint, the SDK always inserts publishers/google into the URL, which causes the app to fail with a 404 UNIMPLEMENTED error. Wrong URL: https://{endpointId}.{location}-{projectId}.prediction.vertexai.goog/v1/projects/{projectId}/locations/{location}/**publishers/google/**models/endpoints/{endpointId}:generateContent
Here is my code. The same issue occurs when using PredictionServiceClient. `String endpoint = String.format("%s.%s-%s.prediction.vertexai.goog:443", endpointId, location, projectId);
VertexAI vertexAi = new VertexAI.Builder().setProjectId(projectId).setLocation(location).setTransport(Transport.REST).setApiEndpoint(endpoint).build();
GenerativeModel model = new GenerativeModel("endpoints/7871795169387872256", vertexAi);`
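To make the difference between the two URLs concrete, here is a small sketch that builds both from the same inputs. The class and method names are hypothetical, and the path templates are copied from the URLs quoted above, not taken from the SDK source.

```java
/** Sketch only: compares the URL the endpoint expects with the one the SDK appears to send. */
final class DedicatedEndpointUrls {

    /** Host pattern of the dedicated endpoint, as quoted in this report. */
    static String host(String endpointId, String location, String projectId) {
        return String.format("%s.%s-%s.prediction.vertexai.goog", endpointId, location, projectId);
    }

    /** URL I expect the SDK to call (template copied from the report). */
    static String expectedUrl(String endpointId, String location, String projectId) {
        return String.format(
            "https://%s/v1/projects/%s/locations/%s/models/endpoints/%s:generateContent",
            host(endpointId, location, projectId), projectId, location, endpointId);
    }

    /** URL observed in practice: identical except for the extra publishers/google segment. */
    static String observedUrl(String endpointId, String location, String projectId) {
        return String.format(
            "https://%s/v1/projects/%s/locations/%s/publishers/google/models/endpoints/%s:generateContent",
            host(endpointId, location, projectId), projectId, location, endpointId);
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        System.out.println(expectedUrl("ENDPOINT_ID", "us-central1", "my-project"));
        System.out.println(observedUrl("ENDPOINT_ID", "us-central1", "my-project"));
    }
}
```

The only difference between the two strings is the `publishers/google/` segment, which is what I would like to be able to switch off for custom endpoints.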
Hi @gomisbahh, thanks for reporting this issue.
Please note that https://github.com/googleapis/java-genai was released recently and has more up-to-date support for Vertex AI.
Additionally, please note that you can file an issue in our Support Hub if you have a contract with us.
Hi @diegomarquezp, thanks for looking into this. I did try the java-genai library, but it is not working for me. The JSON payload it sends is different from what the Vertex AI model (Gemma3) expects, and it calls the endpoint ending with :generateContent instead of :predict.
Vertex AI expects this payload:
{ "instances": [ { "@requestFormat": "chatCompletions", "messages": [ { "role": "user", "content": "Write a short, three-line poem about shared load balancers in GKE." } ], "max_tokens": 100 } ] }
but java-genai sends this payload:
{ "contents": [ { "parts": [ { "text": "Write a short, three-line poem about shared load balancers in GKE." } ], "role": "user" } ], "generationConfig": { "temperature": 0.7, "maxOutputTokens": 50 } }
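As a possible stopgap, the expected `instances` payload can be sent to `:predict` with the JDK's own `java.net.http` client, bypassing the SDKs. The sketch below only builds the request. The class and method names are hypothetical, the `/endpoints/{endpointId}:predict` path is my assumption based on the host pattern and the `:predict` suffix mentioned above, and the access token is assumed to be obtained separately, so please check the path against the sample request shown for your own endpoint.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Sketch only: builds a :predict request carrying the chatCompletions payload. */
final class PredictRequestSketch {

    /** Escapes backslashes and double quotes; enough for simple prompts, not a full JSON encoder. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Payload in the shape the Gemma3 endpoint expects, as quoted in this thread. */
    static String payload(String prompt, int maxTokens) {
        return "{\"instances\":[{\"@requestFormat\":\"chatCompletions\","
            + "\"messages\":[{\"role\":\"user\",\"content\":\"" + escape(prompt) + "\"}],"
            + "\"max_tokens\":" + maxTokens + "}]}";
    }

    /** Builds (but does not send) the POST request. The URL path is an assumption. */
    static HttpRequest build(String endpointId, String location, String projectId,
                             String accessToken, String prompt) {
        String url = String.format(
            "https://%s.%s-%s.prediction.vertexai.goog/v1/projects/%s/locations/%s/endpoints/%s:predict",
            endpointId, location, projectId, projectId, location, endpointId);
        return HttpRequest.newBuilder(URI.create(url))
            .header("Authorization", "Bearer " + accessToken)
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(payload(prompt, 100)))
            .build();
    }
}
```

The request could then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, using a token from, for example, `gcloud auth print-access-token`. This is a workaround, not a replacement for proper SDK support.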
So far, none of Google's official Java libraries work with a Vertex AI Gemma3 dedicated endpoint. I will pass your recommendation about creating an issue in the Support Hub on to the customer.