Steve Han

19 comments by Steve Han

Thanks for the ideas! As long as the conditions field still surfaces the resource quota error, it should be fine. We can read it from the status.
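A minimal sketch of what "reading it from the status" could look like, assuming the status is already available as a plain dict and that each condition carries a `message` string. The function name `find_quota_errors`, the sample condition fields, and the sample message are illustrative assumptions, not taken from the controller in question; the `exceeded quota` substring is the wording Kubernetes uses when a ResourceQuota rejects a request.

```python
from typing import Any


def find_quota_errors(status: dict[str, Any]) -> list[str]:
    """Return the messages of conditions that look like resource quota failures.

    Assumes a Kubernetes-style ``status.conditions`` list of dicts; the exact
    condition types and reasons depend on the controller and are not assumed here.
    """
    messages = []
    for cond in status.get("conditions", []):
        message = cond.get("message", "")
        # Quota rejections from the API server contain "exceeded quota".
        if "exceeded quota" in message:
            messages.append(message)
    return messages


# Hypothetical status payload, for illustration only.
example_status = {
    "conditions": [
        {"type": "Ready", "status": "True", "message": ""},
        {
            "type": "Failed",
            "status": "True",
            "message": 'pods "nb-0" is forbidden: exceeded quota: team-quota',
        },
    ]
}

quota_errors = find_quota_errors(example_status)
```

Matching on the message text is brittle; if the controller sets a dedicated condition `reason`, filtering on that would be the sturdier choice.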

So we've actually been running this in production for quite a while. The only caveat is that the notebook controller cannot directly talk to notebooks in user namespaces to update...

Sure.
```
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: istio-sidecar-prune-egress
  namespace: istio-system
spec:
  egress:
    - hosts:
        - ./*
        - ingress-nginx/*
        - ingress-nginx-serving/*
        - istio-system/*
        - kubeflow/*
        - kube-system/*
        - argocd/*
        -...
```

We are having the same problem. We are on 1.32 and are trying to upgrade from AL2 to AL2023. Changing the userData in the Karpenter EC2NodeClass like you suggested fixed the issue:...

Update: replacing the containerd pause image doesn't fix the issue consistently. We found that using iptables instead of nftables for kube-proxy fixed the issue entirely. I don't think this issue...

Thank you so much for this PR! This would solve my issue. Is there anything else blocking the PR from merging? Should we get rid of the pill and only...

@koonweee thanks again for the PR! I don't seem to have permissions to resolve the conflicts on your branch - how can I help?

> Documentation and Terraform for the reference architecture
> For example, GPU scheduling, reduce image pulling overhead, logging, notifications, ... etc

Interested in how notification should work. We are currently...

Just want to add that this not only reduces the disk size requirement but can also improve loading time by pipelining the process. Here’s a similar project: https://github.com/run-ai/runai-model-streamer
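To make the pipelining point concrete, here is a rough sketch of the general idea: a background thread fetches chunks into a small bounded queue while the main thread loads the chunks that have already arrived. This is only an illustration, not how the linked project is implemented; `pipelined_load`, `chunks`, `load`, and `depth` are hypothetical names.

```python
import queue
import threading
from typing import Callable, Iterable


def pipelined_load(
    chunks: Iterable[bytes],
    load: Callable[[bytes], None],
    depth: int = 2,
) -> int:
    """Overlap fetching and loading; return the total number of bytes loaded.

    At most ``depth`` fetched chunks wait in the queue at a time, so the full
    payload never has to be staged (on disk or in memory) before loading starts.
    """
    buf: queue.Queue = queue.Queue(maxsize=depth)
    done = object()  # sentinel marking the end of the stream

    def producer() -> None:
        try:
            for chunk in chunks:
                buf.put(chunk)  # blocks while the queue is full
        finally:
            buf.put(done)  # always unblock the consumer, even on error

    worker = threading.Thread(target=producer, daemon=True)
    worker.start()

    total = 0
    while True:
        item = buf.get()
        if item is done:
            break
        load(item)
        total += len(item)
    worker.join()
    return total


# Usage with in-memory stand-ins for a remote source and a loader.
loaded: list[bytes] = []
total_bytes = pipelined_load(iter([b"ab", b"cde"]), loaded.append)
```

The bounded queue is what keeps staging space small, and the overlap between fetching and loading is where the time saving would come from.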