Permission denied when reading TrainJob function script when run as non-root user
What happened?
Creating a TrainJob on a cluster where pod security admission is configured to run containers as non-root:
from kubeflow.training import TrainingClient, Trainer

def train_func():
    pass

job_name = TrainingClient().train(
    runtime_ref="torch-distributed",
    trainer=Trainer(func=train_func),
)
The TrainJob fails with a "permission denied" error when the container tries to read the train function script.
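To help narrow this down, here is a minimal diagnostic sketch that could be run inside the trainer container. It is not part of the SDK: `describe_access` is a hypothetical helper, and the temporary file only stands in for the generated script, whose real path and permissions are not shown in this report. It prints the UID of the process alongside the owner and mode of the file, which is usually enough to tell whether a script written by one user is unreadable by another.

```python
import os
import stat
import tempfile


def describe_access(path):
    """Report who is reading the file, who owns it, and whether it is readable.

    Hypothetical helper for debugging only; not a kubeflow-training API.
    """
    st = os.stat(path)
    return {
        "process_uid": os.getuid(),            # UID the container process runs as
        "file_uid": st.st_uid,                 # UID that owns the file
        "mode": stat.filemode(st.st_mode),     # e.g. "-rw-------"
        "readable": os.access(path, os.R_OK),  # can the current user read it?
    }


if __name__ == "__main__":
    # Stand-in for the generated train function script (assumption: the
    # real script location inside the container is not known here).
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write("def train_func():\n    pass\n")
    # Owner-only permissions, to mimic a script that other users cannot read.
    os.chmod(f.name, 0o600)
    print(describe_access(f.name))
    os.unlink(f.name)
```

If `process_uid` and `file_uid` differ and the mode grants no read bit to group or others, that would be consistent with the error described above.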
What did you expect to happen?
Running a TrainJob as root should not be required, and the TrainJob should succeed when run as a non-root user.
Environment
Kubernetes version:
$ kubectl version
Client Version: v1.31.1
Kustomize Version: v5.4.2
Server Version: v1.27.11+ec42b99
Training Operator Python SDK version:
$ pip show kubeflow-training
Version: 2.0.0
Impacted by this bug?
Give it a 👍. We prioritize the issues with the most 👍.
Thanks for creating this @astefanutti!
/remove-label lifecycle/needs-triage
/area sdk
/assign @astefanutti since PR is ready for review. cc @andreyvelich
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.