Edgar Hernández

Results: 97 comments by Edgar Hernández

In the original comment, I see a mention of short-lived tokens that need to be refreshed, because of a secured Prometheus where requests need to be authenticated. I also...

@nirmesh Could you re-upload the image of the pod?

> Suppose many InferenceServices use a single ServingRuntime and you change the ServingRuntime: it will restart a lot of pods at the same time. It will require double the resources...

Personally, I think the current behavior is buggy. I know it is possible to work around the issue by touching the ISVC or by restarting the controller, but it is not...

Is the custom TensorFlow image based on any of the KServe images, or did you build it yourself?

> I used the image from the AWS ECR public repo here: https://github.com/aws/deep-learning-containers/blob/master/available_images.md

What I can tell is that the image/container needs to expose the metrics. KServe, by itself, cannot...

> This is a kind of conversion webhook template and, AFAIK, there is a plan to upgrade the API version in the near future, so I think it is okay to have the...

> How can I scale down the number of InferenceService to zero?

I see your InferenceService has the annotation `serving.kserve.io/deploymentMode: RawDeployment`. Scaling down to zero is not available in `RawDeployment` mode....

BTW, if what you want is simply to downscale an InferenceService to zero replicas (similarly to how a standard `Deployment` can be configured to zero replicas), currently that's not possible....

I think manually managing the replicas is currently unsupported. Is this right, @yuzisun?