Cannot start xgboost-operator container

Open xfate123 opened this issue 5 years ago • 8 comments

I am trying to deploy the operator with kustomize, but the pod is not running and its status is CrashLoopBackOff. Looking into the details with the describe command, I get the following message:

Last State: Terminated
Reason: ContainerCannotRun
Message: OCI runtime create failed: container_linux.go:349: starting container process caused "exec: "/root/manager": stat /root/manager: no such file or directory": unknown

I did some research online but didn't find the answer I wanted. Some people said this is a Windows-specific problem and suggested running kubectl exec -it [pod_name] -c [container_name] -- ./bin/sh. But the container fails to be created, so I cannot use kubectl exec -it. I'm kind of confused and would appreciate any help.

xfate123 avatar Jun 16 '20 22:06 xfate123

Issue-Label Bot is automatically applying the labels:

Label Probability
area/operator 0.60
kind/bug 0.57

issue-label-bot[bot] avatar Jun 16 '20 22:06 issue-label-bot[bot]

@merlintang @terrytangyuan

xfate123 avatar Jun 16 '20 22:06 xfate123

I'm getting the same issue. Any ideas, @merlintang @terrytangyuan?

HassanOuda avatar Jun 24 '20 17:06 HassanOuda

Seems like the issue might be that /root/manager was not built successfully or not copied successfully to the next phase https://github.com/kubeflow/xgboost-operator/blob/78f8cf50bb943247e038a8feb5a9f7e47d810d65/Dockerfile#L19. Are you able to build the Docker image successfully?
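
One rough way to check this locally is to create a container from the image without starting it and then try to copy the binary out. The sketch below is only an illustration and not part of the operator's tooling: the helper names `docker_check_commands` and `image_has_file` are hypothetical, and it assumes the `docker` CLI is on the PATH.

```python
import subprocess
import uuid


def docker_check_commands(image, path="/root/manager", name=None):
    """Build docker commands that probe `image` for `path` without starting it."""
    # Hypothetical container name; a random suffix avoids clashes between runs.
    name = name or f"manager-check-{uuid.uuid4().hex[:8]}"
    return [
        # `docker create` only creates the container; the entrypoint is not executed.
        ["docker", "create", "--name", name, image],
        # `docker cp <container>:<path> -` streams the file as a tar archive
        # and exits non-zero if the path does not exist.
        ["docker", "cp", f"{name}:{path}", "-"],
        # Remove the temporary container afterwards.
        ["docker", "rm", name],
    ]


def image_has_file(image, path="/root/manager"):
    """Return True if `path` exists inside `image` (requires a working docker CLI)."""
    create, copy, remove = docker_check_commands(image, path)
    subprocess.run(create, check=True, capture_output=True)
    try:
        result = subprocess.run(
            copy, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        return result.returncode == 0
    finally:
        subprocess.run(remove, capture_output=True)
```

For example, if `image_has_file("my-registry/xgboost-operator:dev")` (a placeholder tag) returns False, that would suggest the binary never made it into the final image stage.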

terrytangyuan avatar Jun 24 '20 17:06 terrytangyuan

@terrytangyuan I tried building the image with Docker, but the same problem appears. I also pulled the latest image from our repository and started it. I'm still not sure whether this issue is due to my environment setup or the xgboost-operator image, because I didn't have this issue before.

xfate123 avatar Jun 24 '20 17:06 xfate123

Can you show how I can deploy the xgboost operator by building the Docker image and running it, rather than using the kustomize manifests shown in the official README? I'm also getting the same error. I'm deploying it alongside my Kubeflow environment on vanilla k8s.

HassanOuda avatar Jun 24 '20 19:06 HassanOuda

@xfate123 can you check whether the updated version works?

merlintang avatar Jun 25 '20 19:06 merlintang

@merlintang sure, my pleasure

xfate123 avatar Jun 25 '20 19:06 xfate123