
Fine tuned model endpoint cannot be invoked due to hardcoded resource name

Open manocha-aman opened this issue 1 year ago • 2 comments

I am facing this issue with the Java SDK (google-cloud-vertexai:1.2.0), but it may exist in the SDKs for other languages as well.

GenerativeModel builds the endpoint (resource) name from the model name that is passed in:

    new GenerativeModel.Builder().setModelName("models/$MODEL_NAME")

The resource name is computed as:

    this.resourceName = String.format(
        "projects/%s/locations/%s/publishers/google/models/%s",
        vertexAi.getProjectId(), vertexAi.getLocation(), modelName);

This works for existing models but not for fine-tuned models, because the publishers/google/models path is hardcoded.

The endpoint that works is: https://us-central1-aiplatform.googleapis.com/v1/projects/$PROJECT_ID/locations/$LOCATION_ID/endpoints/$ENDPOINT_ID:generateContent

I think fixing the resourceName computation should resolve the issue.
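To illustrate the kind of change suggested here, a minimal sketch follows. It is not the SDK's actual code: the class ResourceNames and the method resolve are hypothetical names, and stripping a leading "models/" prefix is an assumption made only so that the short form shown above maps to a single models/ segment. The idea is to pass a fully qualified name through untouched and apply the publisher template only to short names.

```java
// Hypothetical helper, not part of google-cloud-vertexai.
final class ResourceNames {
    private ResourceNames() {}

    /**
     * Returns the resource name to call.
     * - Names that already start with "projects/" (for example a fine-tuned
     *   model's endpoint) are returned unchanged.
     * - Anything else is treated as a publisher model and expanded with the
     *   template quoted in this issue.
     */
    static String resolve(String projectId, String location, String modelName) {
        if (modelName.startsWith("projects/")) {
            return modelName;
        }
        // Assumption: drop an optional "models/" prefix before expanding.
        String prefix = "models/";
        String shortName = modelName.startsWith(prefix)
            ? modelName.substring(prefix.length())
            : modelName;
        return String.format(
            "projects/%s/locations/%s/publishers/google/models/%s",
            projectId, location, shortName);
    }
}
```

With this shape, a name such as projects/$PROJECT_ID/locations/$LOCATION_ID/endpoints/$ENDPOINT_ID would reach the API as written, while short model names keep their current behavior.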

Here is the stack-trace:

Exception in thread "main" com.google.api.gax.rpc.InvalidArgumentException: io.grpc.StatusRuntimeException: INVALID_ARGUMENT: Request contains an invalid argument.
    at com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:92)
    at com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:41)
    at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:86)
    at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:66)
    at com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97)
    at com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:84)
    at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1130)
    at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
    at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1298)
    at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1059)
    at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:809)
    at io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:568)
    at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:538)
    at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
    at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
    at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
    at com.google.api.gax.grpc.ChannelPool$ReleasingClientCall$1.onClose(ChannelPool.java:570)
    at io.grpc.internal.DelayedClientCall$DelayedListener$3.run(DelayedClientCall.java:489)
    at io.grpc.internal.DelayedClientCall$DelayedListener.delayOrExecute(DelayedClientCall.java:453)
    at io.grpc.internal.DelayedClientCall$DelayedListener.onClose(DelayedClientCall.java:486)
    at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:574)
    at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:72)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:742)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:723)
    at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
    at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:833)
    Suppressed: com.google.api.gax.rpc.AsyncTaskException: Asynchronous task failed
        at com.google.api.gax.rpc.ApiExceptions.callAndTranslateApiException(ApiExceptions.java:57)
        at com.google.api.gax.rpc.UnaryCallable.call(UnaryCallable.java:112)
        at com.google.cloud.vertexai.generativeai.GenerativeModel.generateContent(GenerativeModel.java:329)
        at com.google.cloud.vertexai.generativeai.GenerativeModel.generateContent(GenerativeModel.java:316)
        at com.google.cloud.vertexai.generativeai.ChatSession.sendMessage(ChatSession.java:160)
        at com.google.cloud.vertexai.generativeai.ChatSession.sendMessage(ChatSession.java:148)

manocha-aman avatar May 04 '24 13:05 manocha-aman

cc/ @ZhenyiQ

mpeddada1 avatar May 06 '24 20:05 mpeddada1

Thanks @manocha-aman for raising this issue! We didn't intend to support fine-tuned models initially but will fix this and allow queries to be sent to different resource ids. Stay tuned :) (will keep this open until we submit our change)

ZhenyiQ avatar May 06 '24 20:05 ZhenyiQ

will keep this open until we submit our change

Keeping this open.

suztomo avatar May 17 '24 14:05 suztomo

Hi @manocha-aman, this should now be supported by https://github.com/googleapis/google-cloud-java/pull/10825.

The model name needs to be projects/project_number/locations/location/endpoints/model_id for this to work. We will add documentation for this soon.
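As a rough sketch of assembling a name in that format, the helper below joins the three parts and rejects obviously malformed input. EndpointModelName and its method of are hypothetical and not part of the SDK; the last segment is called endpointId here to match the original report, where the comment above calls it model_id.

```java
// Hypothetical helper, not part of google-cloud-vertexai.
final class EndpointModelName {
    private EndpointModelName() {}

    /** Builds projects/{projectNumber}/locations/{location}/endpoints/{endpointId}. */
    static String of(String projectNumber, String location, String endpointId) {
        return String.format(
            "projects/%s/locations/%s/endpoints/%s",
            segment("projectNumber", projectNumber),
            segment("location", location),
            segment("endpointId", endpointId));
    }

    // A path segment must be non-empty and must not contain a slash.
    private static String segment(String label, String value) {
        if (value == null || value.trim().isEmpty() || value.contains("/")) {
            throw new IllegalArgumentException("Invalid " + label + ": " + value);
        }
        return value;
    }
}
```

The resulting string would then be passed wherever the model name is set, for example to the setModelName call shown in the original report.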

ZhenyiQ avatar May 22 '24 15:05 ZhenyiQ