Jiaxin Shan
related paper: https://arxiv.org/abs/2404.14527
We do not plan to change the orchestration part in v0.2.0. Let's first resolve the cost-efficient serving issue using multiple deployments with some common labels; that's enough. I will...
This is a sub-story of #425. We may use a loose approach like labels to orchestrate the workload in v0.2.0. We can better orchestrate such workloads in v0.3.0 with model...
/cc @M00nF1sh are you aware of any tools to convert kustomize to a Helm package? We do not want to maintain the Helm chart separately.
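A minimal sketch of the label-based idea discussed above, assuming each deployment carries a shared label naming the model it serves. The label key `example.com/model-name`, the deployment names, and the `group_by_model` helper are hypothetical and only illustrate the grouping; the thread does not specify the actual labels.

```python
from collections import defaultdict

# Hypothetical label key; the thread does not name the actual label.
MODEL_LABEL = "example.com/model-name"


def group_by_model(deployments):
    """Group deployment names by the value of the shared model label."""
    groups = defaultdict(list)
    for deployment in deployments:
        model = deployment.get("labels", {}).get(MODEL_LABEL)
        if model is not None:  # skip deployments without the common label
            groups[model].append(deployment["name"])
    return dict(groups)


# Illustrative data: two deployments of one model on different GPU types.
deployments = [
    {"name": "llama-a10", "labels": {MODEL_LABEL: "llama"}},
    {"name": "llama-l20", "labels": {MODEL_LABEL: "llama"}},
    {"name": "unrelated", "labels": {}},
]
groups = group_by_model(deployments)
```

Deployments sharing the label value end up in one group, which is all the loose, label-based orchestration needs before a dedicated model abstraction exists.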
Same here. It only happens on a Lambda instance + nvkind.
@M00nF1sh Can we add the helm repo first? Let's have a short discussion on the kustomize manifests maintenance later.
Kustomization and generated YAML should be good enough for the v0.1.0 release. Helm package support can be postponed to v0.2.0.
The problem still exists.
Actually, most of the containers crashed: metadata-service, gpu-optimizer, gateway-plugin, redis-master, controller-manager.
Three categories:
- Solid software like redis/controller/gateway-plugin: exitCode is 0. They all have error handling.
- Components we wrote ourselves, like gpu-optimizer and metadata-service, show other error codes.
- kuberay...