Kubeflow minio pod showing "Error: disk path full"
/kind question
Question: The Kubeflow MinIO pod is showing "Error: disk path full". Is there any purging strategy to handle pipeline artifact deletion?
Details: pod status:

```
[root@master ~]# kubectl get po -n kubeflow | grep minio
minio-5579d6498b-87r7m   2/2     Running   2          6d
[root@master ~]#
```
wait container logs:
```
time="2022-07-11T06:02:23.468Z" level=info msg="Saving from /tmp/argo/outputs/logs/main.log to s3 (endpoint: minio-service.kubeflow:9000, bucket: mlpipeline, key: artifacts/XXXX-pipeline-6x54b/XXXX-pipeline-6x54b-2106946359/main.log)"
time="2022-07-11T06:02:23.471Z" level=warning msg="Failed to put file: Storage backend has reached its minimum free disk threshold. Please delete a few objects to proceed."
time="2022-07-11T06:02:39.821Z" level=info msg="S3 Save path: /tmp/argo/outputs/logs/main.log, key: artifacts/XXXX-pipeline-6x54b/XXXX-pipeline-6x54b-2106946359/main.log"
time="2022-07-11T06:02:39.821Z" level=info msg="Creating minio client minio-service.kubeflow:9000 using static credentials"
time="2022-07-11T06:02:39.821Z" level=info msg="Saving from /tmp/argo/outputs/logs/main.log to s3 (endpoint: minio-service.kubeflow:9000, bucket: mlpipeline, key: artifacts/XXXX-pipeline-6x54b/XXXX-pipeline-6x54b-2106946359/main.log)"
time="2022-07-11T06:02:39.827Z" level=warning msg="Failed to put file: Storage backend has reached its minimum free disk threshold. Please delete a few objects to proceed."
time="2022-07-11T06:02:39.827Z" level=error msg="executor error: timed out waiting for the condition"
time="2022-07-11T06:02:39.827Z" level=info msg="Killing sidecars"
time="2022-07-11T06:02:39.831Z" level=info msg="Killing sidecar echo (25c2c53996b4f1facef84c76106e76cf5ec2caef53a713463da8f513165d654e)"
time="2022-07-11T06:02:39.831Z" level=info msg="Killing containers [25c2c53996b4f1facef84c76106e76cf5ec2caef53a713463da8f513165d654e]"
time="2022-07-11T06:02:39.831Z" level=info msg="SIGTERM containerID \"25c2c53996b4f1facef84c76106e76cf5ec2caef53a713463da8f513165d654e\": terminated"
time="2022-07-11T06:02:39.834Z" level=info msg="https://172.30.0.1:443/api/v1/namespaces/kubeflow/pods/XXXX-pipeline-6x54b-2106946359/exec?command=%2Fbin%2Fsh&command=-c&command=kill+-15+1&container=echo&stderr=true&stdout=false&tty=false"
time="2022-07-11T06:03:09.858Z" level=error msg="executor error: Timeout occurred\ngithub.com/argoproj/argo/v2/errors.Wrap\n\t/go/src/github.com/argoproj/argo/errors/errors.go:88\ngithub.com/argoproj/argo/v2/errors.InternalWrapError\n\t/go/src/github.com/argoproj/argo/errors/errors.go:71\ngithub.com/argoproj/argo/v2/workflow/common.GetExecutorOutput\n\t/go/src/github.com/argoproj/argo/workflow/common/util.go:240\ngithub.com/argoproj/argo/v2/workflow/executor/k8sapi.(*k8sAPIClient).KillContainer\n\t/go/src/github.com/argoproj/argo/workflow/executor/k8sapi/client.go:87\ngithub.com/argoproj/argo/v2/workflow/executor/common.TerminatePodWithContainerID\n\t/go/src/github.com/argoproj/argo/workflow/executor/common/common.go:92\ngithub.com/argoproj/argo/v2/workflow/executor/common.KillGracefully\n\t/go/src/github.com/argoproj/argo/workflow/executor/common/common.go:98\ngithub.com/argoproj/argo/v2/workflow/executor/k8sapi.(*k8sAPIClient).killGracefully\n\t/go/src/github.com/argoproj/argo/workflow/executor/k8sapi/client.go:92\ngithub.com/argoproj/argo/v2/workflow/executor/k8sapi.(*K8sAPIExecutor).Kill\n\t/go/src/github.com/argoproj/argo/workflow/executor/k8sapi/k8sapi.go:71\ngithub.com/argoproj/argo/v2/workflow/executor.(*WorkflowExecutor).KillSidecars\n\t/go/src/github.com/argoproj/argo/workflow/executor/executor.go:1268\ngithub.com/argoproj/argo/v2/cmd/argoexec/commands.waitContainer.func1\n\t/go/src/github.com/argoproj/argo/cmd/argoexec/commands/wait.go:33\ngithub.com/argoproj/argo/v2/cmd/argoexec/commands.waitContainer\n\t/go/src/github.com/argoproj/argo/cmd/argoexec/commands/wait.go:62\ngithub.com/argoproj/argo/v2/cmd/argoexec/commands.NewWaitCommand.func1\n\t/go/src/github.com/argoproj/argo/cmd/argoexec/commands/wait.go:16\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/[email protected]/command.go:846\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/[email protected]/command.go:950\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/[email protected]/command.go:887\nmain.main\n\t/go/src/github.com/argoproj/argo/cmd/argoexec/main.go:13\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:203\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1357"
time="2022-07-11T06:03:09.858Z" level=info msg="Alloc=6961 TotalAlloc=20663 Sys=71360 NumGC=7 Goroutines=15"
time="2022-07-11T06:03:09.880Z" level=fatal msg="timed out waiting for the condition"
```
minio pod logs:
```
API: PutObject(bucket=mlpipeline, object=artifacts/XXXX-pipeline-6x54b/XXXX-pipeline-6x54b-2106946359/main.log)
Time: 06:02:23 UTC 07/11/2022
DeploymentID: 97f93caf-2d5d-4b5c-bbe6-a8aa0be1be70
RequestID: 1700B10F9BED7235
RemoteHost: 10.128.0.213
Host: minio-service.kubeflow:9000
UserAgent: MinIO (linux; amd64) minio-go/v7.0.2
Error: disk path full
       5: cmd/fs-v1-helpers.go:320:cmd.fsCreateFile()
       4: cmd/fs-v1.go:903:cmd.(*FSObjects).putObject()
       3: cmd/fs-v1.go:817:cmd.(*FSObjects).PutObject()
       2: cmd/object-handlers.go:1255:cmd.objectAPIHandlers.PutObjectHandler()
       1: net/http/server.go:1995:http.HandlerFunc.ServeHTTP()

API: PutObject(bucket=mlpipeline, object=artifacts/XXXX-pipeline-6x54b/XXXX-pipeline-6x54b-2106946359/main.log)
Time: 06:02:39 UTC 07/11/2022
DeploymentID: 97f93caf-2d5d-4b5c-bbe6-a8aa0be1be70
RequestID: 1700B1136AC65A77
RemoteHost: 10.128.0.213
Host: minio-service.kubeflow:9000
UserAgent: MinIO (linux; amd64) minio-go/v7.0.2
Error: disk path full
       5: cmd/fs-v1-helpers.go:320:cmd.fsCreateFile()
       4: cmd/fs-v1.go:903:cmd.(*FSObjects).putObject()
       3: cmd/fs-v1.go:817:cmd.(*FSObjects).PutObject()
       2: cmd/object-handlers.go:1255:cmd.objectAPIHandlers.PutObjectHandler()
       1: net/http/server.go:1995:http.HandlerFunc.ServeHTTP()
```
@shubamsharma, I think you might need to check https://docs.arrikto.com/ops/kubeflow/minio.html. Hope this helps.
@shan100github, thanks for the reply.
But the information above describes a manual approach. Is there any way to enable automatic purging of pipeline artifacts?
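In the meantime, one way to automate this is a cron-style script that deletes artifacts older than a retention window via the MinIO API. The sketch below is an assumption, not a built-in KFP feature: the bucket `mlpipeline`, prefix `artifacts/`, and endpoint `minio-service.kubeflow:9000` come from the logs above, while the credentials and 30-day window are placeholders you must adapt. The age-filtering logic is kept in a pure function so it can be tested without a live cluster.

```python
# Hedged sketch: purge old Kubeflow pipeline artifacts from MinIO.
# Endpoint, bucket, and prefix are taken from the logs in this issue;
# credentials and retention window are placeholder assumptions.
from datetime import datetime, timedelta, timezone


def select_expired(objects, max_age_days):
    """Given (object_name, last_modified) pairs, return the names of
    objects whose last_modified is older than max_age_days."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=max_age_days)
    return [name for name, modified in objects if modified < cutoff]


if __name__ == "__main__":
    # Requires the 'minio' Python SDK. The service address matches the
    # endpoint in the wait-container logs; the access/secret keys below
    # are the commonly used defaults and may differ in your deployment.
    from minio import Minio

    client = Minio("minio-service.kubeflow:9000",
                   access_key="minio", secret_key="minio123", secure=False)
    listing = [(obj.object_name, obj.last_modified)
               for obj in client.list_objects("mlpipeline",
                                              prefix="artifacts/",
                                              recursive=True)]
    for name in select_expired(listing, max_age_days=30):
        client.remove_object("mlpipeline", name)
        print("deleted", name)
```

Run it from a pod (or via port-forward) with network access to `minio-service.kubeflow:9000`; a dry run that only prints `select_expired(...)` first is advisable before deleting anything.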
/close
There has been no activity for a long time. Please reopen if necessary.
@juliusvonkohout: Closing this issue.
In response to this:
/close
There has been no activity for a long time. Please reopen if necessary.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.