Dan Sun
That is correct, thanks @mstopa!
Unfortunately, the current implementation of progressive rollout does not help when `minReplica` is set. For example, if you set `minReplica` to 10 with 1 GPU for each replica, then you...
@dprotaso scale-to-zero is not the main reason we use Knative; the main reason is revision-based rollout, which is not supported via Deployment. With raw deployment you can't stage traffic and...
@dprotaso I have tested the [initial scale annotation](https://knative.dev/docs/serving/autoscaling/scale-bounds/#initial-scale). It does not mark the revision Ready earlier, because the larger of initial scale and lower bound is chosen as the initial...
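To make the observation above concrete, here is a minimal sketch of the behavior as described in this comment: whichever of the two values is larger wins. The function name and the sample values are hypothetical, and this only models the described outcome; it is not Knative's actual code.

```python
def effective_initial_scale(initial_scale: int, min_scale: int) -> int:
    """Model the described behavior: the larger of the two values is used."""
    return max(initial_scale, min_scale)


# Hypothetical values: initial scale of 1 with a lower bound of 10.
# The lower bound dominates, so the low initial scale has no effect.
print(effective_initial_scale(initial_scale=1, min_scale=10))  # 10
```

Under this model, lowering the initial scale below the lower bound changes nothing, which matches the result of the test described above.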
The custom predictor also supports `CloudEvent`, so you should be able to send requests in `CloudEvent` format with binary data. Here is an example for [avro input data](https://github.com/kserve/kserve/blob/master/python/kserve/test/test_server.py#L500). We are...
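As a rough illustration of what a binary-mode `CloudEvent` request looks like on the wire, the sketch below builds the HTTP headers and body following the CloudEvents HTTP binding, where attributes travel as `ce-` prefixed headers and the payload is the raw body. The helper name, event type, source, and content type are hypothetical placeholders, not values taken from KServe.

```python
import uuid
from typing import Dict, Tuple


def build_binary_cloudevent(
    data: bytes,
    event_type: str = "com.example.inference.request",  # hypothetical type
    source: str = "example/client",                      # hypothetical source
    content_type: str = "application/avro",              # assumed payload type
) -> Tuple[Dict[str, str], bytes]:
    """Return (headers, body) for a binary-content-mode CloudEvent over HTTP."""
    headers = {
        # Required CloudEvents attributes, mapped to ce-* headers.
        "ce-specversion": "1.0",
        "ce-id": str(uuid.uuid4()),
        "ce-source": source,
        "ce-type": event_type,
        # In binary mode, datacontenttype maps to the Content-Type header.
        "Content-Type": content_type,
    }
    # The event data is sent unmodified as the HTTP body.
    return headers, data


headers, body = build_binary_cloudevent(b"\x02\x06foo")
print(sorted(headers))
```

The resulting headers and body can be sent with any HTTP client as a POST to the predictor's endpoint; decoding the binary payload (for example avro) is up to the custom predictor.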
Awesome work @Suresh-Nakkeran !! /lgtm /approve
@hutm I am not sure how this is relevant to inference: an MPI job eventually trains a model, and then KServe loads it. Are you saying the model is too big...
@hutm do you have an example of how you are running multi-node inference? I think the inference graph we are working on for the 0.9 release might be able to...
@oyy2000 can you check the status in the inference service YAML instead of the pods? There might be some setup issues that explain why your pods are not coming up.
@sindhuvahinis can you show the YAML you deployed with KServe?