local-path-provisioner

create process timeout after 120 seconds

Open meyerbro opened this issue 6 years ago • 16 comments

Followed the tutorial not changing a single thing...

Got this in the logs: create process timeout after 120 seconds

And it keeps retrying the PVC volume creation...

I can see that no PV gets created either...

Do I need to create the folder /opt/local-path-provisioner manually? (Even after doing that, it doesn't work.)

Did anyone have the same issue?

Thanks!

meyerbro avatar Aug 14 '19 16:08 meyerbro

@meyerbro Can you provide the log from the local path provisioner pod? You shouldn't need to create /opt/local-path-provisioner manually.

yasker avatar Aug 14 '19 18:08 yasker

Sure @yasker, here it is:

time="2019-08-14T16:14:34Z" level=debug msg="Applied config: {\"nodePathMap\":[{\"node\":\"DEFAULT_PATH_FOR_NON_LISTED_NODES\",\"paths\":[\"/opt/local-path-provisioner\"]}]}"
time="2019-08-14T16:14:34Z" level=debug msg="Provisioner started"
time="2019-08-14T16:14:51Z" level=debug msg="config doesn't contain node shared-jenk-24.sandbox.local, use DEFAULT_PATH_FOR_NON_LISTED_NODES instead"
time="2019-08-14T16:14:51Z" level=info msg="Creating volume pvc-a32c81ba-beae-11e9-8212-002590fb4cd4 at shared-jenk-24.sandbox.local:/opt/local-path-provisioner/pvc-a32c81ba-beae-11e9-8212-002590fb4cd4"
E0814 16:16:52.202355       1 controller.go:701] error syncing claim "default/local-path-pvc": failed to provision volume with StorageClass "local-path": failed to create volume pvc-a32c81ba-beae-11e9-8212-002590fb4cd4: create process timeout after 120 seconds
time="2019-08-14T16:17:07Z" level=debug msg="config doesn't contain node shared-jenk-24.sandbox.local, use DEFAULT_PATH_FOR_NON_LISTED_NODES instead"
time="2019-08-14T16:17:07Z" level=info msg="Creating volume pvc-a32c81ba-beae-11e9-8212-002590fb4cd4 at shared-jenk-24.sandbox.local:/opt/local-path-provisioner/pvc-a32c81ba-beae-11e9-8212-002590fb4cd4"
E0814 16:19:07.535033       1 controller.go:701] error syncing claim "default/local-path-pvc": failed to provision volume with StorageClass "local-path": failed to create volume pvc-a32c81ba-beae-11e9-8212-002590fb4cd4: create process timeout after 120 seconds
time="2019-08-14T16:19:37Z" level=debug msg="config doesn't contain node shared-jenk-24.sandbox.local, use DEFAULT_PATH_FOR_NON_LISTED_NODES instead"
time="2019-08-14T16:19:37Z" level=info msg="Creating volume pvc-a32c81ba-beae-11e9-8212-002590fb4cd4 at shared-jenk-24.sandbox.local:/opt/local-path-provisioner/pvc-a32c81ba-beae-11e9-8212-002590fb4cd4"
E0814 16:21:37.876760       1 controller.go:701] error syncing claim "default/local-path-pvc": failed to provision volume with StorageClass "local-path": failed to create volume pvc-a32c81ba-beae-11e9-8212-002590fb4cd4: create process timeout after 120 seconds

meyerbro avatar Aug 15 '19 09:08 meyerbro

@meyerbro The creation process starts a helper pod on the host to set up the directory. It seems the pod doesn't finish within 120 seconds. Can you check why the helper pod is stuck? The name of the pod is create-<volumename>

yasker avatar Aug 15 '19 14:08 yasker

Hi @yasker, just checking that, currently no logs yet... Will wait the full 120 seconds and update here.

meyerbro avatar Aug 15 '19 15:08 meyerbro

After the 120 seconds, the only thing that happened was that the container was terminated, and a few seconds later a new one was created, again with no logs...

meyerbro avatar Aug 15 '19 15:08 meyerbro

# docker ps | grep create
a0ac37b5d22a        rancher/pause:3.1                                 "/pause"                 4 seconds ago        Up 2 seconds                            k8s_POD_create-pvc-0de20954-bf73-11e9-8212-002590fb4cd4_local-path-storage_615625fe-bf73-11e9-8212-002590fb4cd4_0
# docker logs -f a0ac37b5d22a
Shutting down, got signal: Terminated

meyerbro avatar Aug 15 '19 15:08 meyerbro

I decided to give hostpath from rimusz-charts a try to see whether something is wrong only with my cluster, and it actually worked fine... I would prefer using yours, as we run RKE here...

meyerbro avatar Aug 15 '19 15:08 meyerbro

@meyerbro After the create- pod starts, does it change to Running later, or does it always stay in the Pending state? Can you try kubectl describe on the pod? Something is stopping the helper pod from working.
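
A minimal sketch of that check, assuming the helper pod is named create-<volumename> as described in this thread and runs in the local-path-storage namespace (the namespace that appears in the docker ps output above). The function names are hypothetical, and other releases may name the helper pod differently.

```shell
#!/bin/sh
# Assumption: namespace taken from the container name shown by docker ps
# in this thread; adjust it if your deployment differs.
NAMESPACE="local-path-storage"

# Build the helper pod name from a volume name (create-<volumename>).
helper_pod_name() {
    printf 'create-%s\n' "$1"
}

# Print status and events for the helper pod of the given volume.
# Needs kubectl with access to the cluster; nothing runs until called.
inspect_helper_pod() {
    pod="$(helper_pod_name "$1")"
    kubectl -n "$NAMESPACE" get pod "$pod" -o wide
    kubectl -n "$NAMESPACE" describe pod "$pod"
}
```

Calling inspect_helper_pod with the volume name from the provisioner log (for example pvc-0de20954-bf73-11e9-8212-002590fb4cd4) within the 120 second window should show the Events section, which is usually where scheduling, image pull, or mount problems surface.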

yasker avatar Aug 15 '19 16:08 yasker

Not sure if the root cause is the same as for @meyerbro, but I am seeing similar issues on CentOS Linux release 8.1.1911 (Core) with SELinux set to Enforcing, and it looks like creation of the local path directory is blocked:

type=AVC msg=audit(1590840690.471:714): avc:  denied  { create } for  pid=17975 comm="mkdir" name="pvc-f349b28b-12cd-42c1-ad7b-f1bf018f9171" scontext=system_u:system_r:container_t:s0:c228,c866 tcontext=system_u:object_r:container_var_lib_t:s0 tclass=dir permissive=0
type=SYSCALL msg=audit(1590840690.471:714): arch=c000003e syscall=83 per=400000 success=no exit=-13 a0=7ffd8cb6676d a1=1ff a2=7ffd8cb6679b a3=1ff items=0 ppid=17862 pid=17975 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="mkdir" exe="/bin/mkdir" subj=system_u:system_r:container_t:s0:c228,c866 key=(null)^]ARCH=x86_64 SYSCALL=mkdir AUID="unset" UID="root" GID="root" EUID="root" SUID="root" FSUID="root" EGID="root" SGID="root" FSGID="root"

After running setenforce 0 to switch SELinux to permissive mode temporarily, the problem is gone and the PVCs are bound.
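
To read a denial like the one above more easily, here is a small sketch that pulls individual key=value fields out of an audit record line. The helper name avc_field is hypothetical, and the parsing assumes fields are separated by whitespace as in the record quoted here.

```shell
#!/bin/sh
# Extract one key=value field from an audit record line.
#   $1 = field name (e.g. comm, scontext, tcontext, tclass)
#   $2 = the audit line
# Surrounding double quotes are stripped from the value.
avc_field() {
    printf '%s\n' "$2" |
        sed -n "s/.*[[:space:]]$1=\([^[:space:]]*\).*/\1/p" |
        tr -d '"'
}
```

For the AVC record above, the comm field gives mkdir, tclass gives dir, and tcontext gives system_u:object_r:container_var_lib_t:s0, that is, a container process was denied creating a directory under a path labeled container_var_lib_t. On a host with auditd, records like this can typically be listed with ausearch -m avc -ts recent, and getenforce reports the current mode.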

cruizer avatar May 30 '20 12:05 cruizer

Hello @yasker, should I open a separate issue for my SELinux-related root cause in https://github.com/rancher/k3s-selinux ? The SELinux policy RPM is installed, by the way.

Thanks a bunch.

cruizer avatar Jun 02 '20 16:06 cruizer

@cruizer Yes, can you create a separate issue? It's clear that your issue is caused by SELinux.

yasker avatar Jun 02 '20 22:06 yasker

OK, done, I have opened: https://github.com/rancher/k3s-selinux/issues/9

Thank you!

cruizer avatar Jun 04 '20 13:06 cruizer

Hi @yasker, the same issue is happening for me as well. Is it because of the multi-node cluster?

I have used terraform/RKE to set up the cluster with 2 nodes.

ClenchPaign avatar Mar 25 '22 07:03 ClenchPaign

Yes, I'm seeing the same on an RKE2 cluster for backup-operator-pvc @cruizer @yasker

Warning ProvisioningFailed persistentvolumeclaim/backup-operator-pvc failed to provision volume with StorageClass "local-path": failed to create volume pvc-a1a56725-86c9-49a3-9d8b-cd5a9b8f67d7: create process timeout after 120 seconds

DivyaKhatnaar avatar Mar 27 '22 23:03 DivyaKhatnaar

This works now when using version 0.0.21 instead of master. Solution: https://stackoverflow.com/a/71605241/18577365
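
A sketch of pinning the deployment to that release instead of master. The URL layout and the v prefix on the tag are assumptions based on how the project publishes its manifest; verify them against the repository before applying. The function name manifest_url is hypothetical.

```shell
#!/bin/sh
# Release reported as working in this thread. Assumption: the git tag
# carries a "v" prefix.
VERSION="v0.0.21"

# Build the raw manifest URL for a given tag (assumed repository layout).
manifest_url() {
    printf 'https://raw.githubusercontent.com/rancher/local-path-provisioner/%s/deploy/local-path-storage.yaml\n' "$1"
}

# To deploy the pinned release (requires kubectl and cluster access):
#   kubectl apply -f "$(manifest_url "$VERSION")"
```

Pinning to a tag keeps the manifest and the provisioner image in step, whereas a manifest taken from master can change between installs.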

ClenchPaign avatar Mar 28 '22 04:03 ClenchPaign

@ClenchPaign The error described in the Stack Overflow post is fixed. You can try the master branch as well, but 0.0.21 is the latest stable version.

derekbit avatar Mar 28 '22 04:03 derekbit