[Feature] Allow easily overriding `command` for head/worker pods
Search before asking
- [x] I searched the issues and found no similar feature request.
Description
I would like to override just the `command` part of the head/worker pods of a cluster. Specifically, right now I would like to use `bash -c` instead of `bash -lc`, but I could also see situations where I want to use a different shell (e.g. if bash is not present in my image).
However, if I set `command: ["bash", "-c"]` on the cluster template, I end up with the pod having the specification below, which is not what I want:
Command:
  /bin/bash
  -lc
  --
Args:
  bash -c && ulimit -n 65536; ray start --head --dashboard-host=0.0.0.0 --metrics-export-port=8080 --block --dashboard-agent-listen-port=52365 --no-monitor --num-cpus=1 --memory=1073741824
If I manually specify both `command` and `args`, this works for the head pod. However, this becomes difficult for workers, because the `wait-gcs-ready` args are quite complex.
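To make the problem with the generated spec concrete, here is a minimal sketch of how a shell would interpret an args string of that shape. The `echo` commands are hypothetical stand-ins for the real `ulimit` and `ray start` steps, and `--noprofile` is added only so the demo does not depend on local profile files; it is not part of the generated pod command.

```shell
# Same shape as the generated Args line: the overridden command ("bash -c")
# has been prepended to the start script instead of replacing the entrypoint.
# The echo commands are stand-ins (assumptions) for ulimit and ray start.
generated_args='bash -c && echo ulimit-step; echo ray-start-step'

# Run it the way the pod's Command would: a login shell given one command string.
# The inner "bash -c" has no command string, so it fails; "&&" then skips the
# ulimit stand-in, while ";" still runs the ray start stand-in in the OUTER shell.
output="$(bash --noprofile -lc -- "$generated_args" 2>/dev/null)"
echo "$output"
```

Only `ray-start-step` is printed: the override never takes effect, the `ulimit` step is skipped, and the start command still runs under the outer login shell.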
Use case
I have a Ray image based on micromamba, which sets `$PATH` in login shells (in `/etc/profile`). Therefore, the default command of `bash -lc` doesn't work (it cannot find the `ray` binary). Running a normal shell (with `bash -c`) works fine.
FWIW, I don't think using a login shell by default is correct, because this isn't an interactive login. But I see it might be difficult to change the default for compatibility reasons, so just making it easily overridable would be fine.
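The difference between the two invocations can be reproduced outside Kubernetes. The sketch below is a simulation under stated assumptions, not the actual image: it uses a throwaway `HOME` whose `.bash_profile` resets `PATH`, and a hypothetical `fake-ray` script in place of the real binary.

```shell
# Hypothetical setup: a throwaway HOME whose login profile resets PATH,
# standing in for an image where login shells lose the environment's bin dir.
demo_home="$(mktemp -d)"
mkdir -p "$demo_home/env/bin"
printf '#!/bin/sh\necho fake-ray\n' > "$demo_home/env/bin/fake-ray"
chmod +x "$demo_home/env/bin/fake-ray"
printf 'PATH=/usr/bin:/bin\n' > "$demo_home/.bash_profile"

# Report whether fake-ray resolves under the given bash flags (-c or -lc).
check() {
  if HOME="$demo_home" PATH="$demo_home/env/bin:$PATH" \
      bash "$1" 'command -v fake-ray' >/dev/null 2>&1; then
    echo found
  else
    echo missing
  fi
}

nonlogin_result="$(check -c)"   # plain shell: inherits PATH from the caller
login_result="$(check -lc)"     # login shell: sources the profile, PATH is reset
echo "bash -c:  $nonlogin_result"
echo "bash -lc: $login_result"
rm -rf "$demo_home"
```

In this simulation the binary is found with `bash -c` and missing with `bash -lc`, which matches the behaviour described above.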
Related issues
#2208 is related, but is just for the submitter pod of jobs, while I want to do the same for the head & worker pods.
Are you willing to submit a PR?
- [ ] Yes I am willing to submit a PR!
@kwohlfahrt were you able to find a workaround here? I am seeing the exact same thing.
Unfortunately not, we ended up modifying our image to be compatible with Ray.
Same, I ended up adding micromamba's activate current env script to profile.d for now.
RUN cp /usr/local/bin/_activate_current_env.sh /etc/profile.d/