security-profiles-operator

The spod daemonset doesn't work; the log shows "fatal error: timer when must be positive".

neblen opened this issue 3 years ago · 22 comments

What happened:

The spod daemonset doesn't work. I deployed SPO 0.4.1 on a Kubernetes cluster running on Ubuntu, but spod doesn't run correctly.

What you expected to happen:

The spod daemonset works correctly.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

The error log is as follows.

I0418 05:30:48.045731 1 logr.go:252] "msg"="Set logging verbosity to 0"
I0418 05:30:48.045731 1 logr.go:252] "msg"="Profiling support enabled: false"
fatal error: timer when must be positive
goroutine 4 [running]:
runtime.throw({0x1b3a37a, 0x0})
runtime/panic.go:1198 +0x71 fp=0xc000059ee8 sp=0xc000059eb8 pc=0x438b31
runtime.modtimer(0xb418cff40fe67fd0, 0xc00004a000, 0xc00004a000, 0x1, {0x0, 0x0}, 0x1a43301)
runtime/time.go:425 +0x4a7 fp=0xc000059f30 sp=0xc000059ee8 pc=0x458407
runtime.resettimer(...)
runtime/time.go:539
time.resetTimer(...)
runtime/time.go:229
runtime.scavengeSleep(0x1e8480)
runtime/mgcscavenge.go:237 +0x73 fp=0xc000059f80 sp=0xc000059f30 pc=0x423a53
runtime.bgscavenge()
runtime/mgcscavenge.go:366 +0x205 fp=0xc000059fe0 sp=0xc000059f80 pc=0x423cc5
runtime.goexit()
runtime/asm_amd64.s:1581 +0x1 fp=0xc000059fe8 sp=0xc000059fe0 pc=0x46c9c1
created by runtime.gcenable
runtime/mgc.go:182 +0x65
goroutine 1 [runnable]:
flag.sortFlags(0x0)
flag/flag.go:353 +0x51
flag.(*FlagSet).Visit(0xc000082d80, 0xc000461b48)
flag/flag.go:407 +0x25
github.com/urfave/cli/v2.normalizeFlags({0xc000082c80, 0x3, 0xc0000b2d80}, 0xc000327200)
github.com/urfave/cli/[email protected]/context.go:174 +0xec
github.com/urfave/cli/v2.(*Command).parseFlags(0xc0000b2d80, {0x1e12ab8, 0xc0002425b8}, 0xb0)
github.com/urfave/cli/[email protected]/command.go:194 +0x1d8
github.com/urfave/cli/v2.(*Command).Run(0xc0000b2d80, 0xc000082900)
github.com/urfave/cli/[email protected]/command.go:106 +0x1ef
github.com/urfave/cli/v2.(*App).RunContext(0xc0001a0680, {0x1df3d48, 0xc000122000}, {0xc000134000, 0x2, 0x2})
github.com/urfave/cli/[email protected]/app.go:313 +0x81e
github.com/urfave/cli/v2.(*App).Run(...)
github.com/urfave/cli/[email protected]/app.go:224
main.main()
sigs.k8s.io/security-profiles-operator/cmd/security-profiles-operator/main.go:188 +0xd70
goroutine 19 [chan receive]:
k8s.io/klog/v2.(*loggingT).flushDaemon(0x0)
k8s.io/klog/[email protected]/klog.go:1283 +0x6a
created by k8s.io/klog/v2.init.0
k8s.io/klog/[email protected]/klog.go:420 +0xfb
goroutine 20 [sleep]:
time.Sleep(0x6fc23ac00)
runtime/time.go:193 +0x12e
sigs.k8s.io/controller-runtime/pkg/log.init.0.func1()
sigs.k8s.io/[email protected]/pkg/log/log.go:63 +0x38
created by sigs.k8s.io/controller-runtime/pkg/log.init.0
sigs.k8s.io/[email protected]/pkg/log/log.go:62 +0x25

Environment:

  • Cloud provider or hardware configuration: k8s
  • OS (e.g. cat /etc/os-release): Ubuntu 20.04
  • Kernel (e.g. uname -a): Linux version 5.13.0-39-generic (buildd@lcy02-amd64-080) (gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #44~20.04.1-Ubuntu SMP Thu Mar 24 16:43:35 UTC 2022
  • Others:

neblen avatar Apr 08 '22 06:04 neblen

This seems to be an issue with controller runtime :thinking: Hm, never seen this before.

saschagrunert avatar Apr 08 '22 07:04 saschagrunert

I deployed two worker nodes. The log of the second node is as follows.

 "msg"="Set logging verbosity to 0"
I0418 05:30:48.045731 1 logr.go:252] "msg"="Profiling support enabled: false"
I0418 05:30:48.045731 1 logr.go:252] setup "msg"="starting component: spod" "buildDate"="1980-01-01T00:00:00Z" "compiler"="gc" "gitCommit"="unknown" "gitTreeState"="clean" "goVersion"="go1.17.6" "libbpf"="0.6.1" "libseccomp"="2.5.3" "platform"="linux/amd64" "version"="0.4.1"
E0418 05:32:32.045731 1 logr.go:270] "msg"="Failed to get API Group-Resources" "error"="Get \"https://10.43.0.1:443/api?timeout=32s\": dial tcp 10.43.0.1:443: i/o timeout"
E0418 05:32:32.045731 1 logr.go:270] setup "msg"="running security-profiles-operator" "error"="create manager: Get \"https://10.43.0.1:443/api?timeout=32s\": dial tcp 10.43.0.1:443: i/o timeout"

neblen avatar Apr 08 '22 07:04 neblen

Do you have cert manager deployed and is it healthy?

saschagrunert avatar Apr 08 '22 07:04 saschagrunert

@saschagrunert Thank you for your reply. I have deployed cert-manager and it runs correctly. The cert-manager version is v1.6.1.

neblen avatar Apr 08 '22 08:04 neblen

Alright, so can this be closed now? :)

saschagrunert avatar Apr 08 '22 09:04 saschagrunert

I'm sorry, I wasn't clear. cert-manager runs correctly, but spod still can't run.

neblen avatar Apr 08 '22 09:04 neblen

btw @bachradsusi was able to reproduce this issue locally. I don't know what causes it.

jhrozek avatar Apr 25 '22 18:04 jhrozek

I am now also encountering this issue when running the e2e tests locally on kind:

I0907 12:15:04.002569       1 logr.go:261]  "msg"="Set logging verbosity to 0"
I0907 12:15:36.002569       1 logr.go:261]  "msg"="Profiling support enabled: false"
I0907 12:00:56.002568       1 logr.go:261] setup "msg"="starting component: spod"  "buildDate"="1980-01-01T00:00:00Z" "buildTags"="netgo,osusergo,seccomp,apparmor" "cgoldFlags"="-lseccomp -lelf -lz -lbpf" "compiler"="gc" "dependencies"="github.com/ReneKroon/ttlcache/v2 v2.11.0 ,github.com/acobaugh/osrelease v0.0.0-20181218015638-a93a0a55a249 ,github.com/aquasecurity/libbpfgo v0.2.5-libbpf-0.7.0 ,github.com/beorn7/perks v1.0.1 ,github.com/blang/semver v3.5.1+incompatible ,
github.com/cespare/xxhash/v2 v2.1.2 ,github.com/containers/common v0.47.5 ,github.com/cpuguy83/go-md2man/v2 v2.0.1 ,github.com/crossplane/crossplane-runtime v0.14.1-0.20210713194031-85b19c28ea88 ,github.com/davecgh/go-spew v1.1.1 ,github.com/evanphx/json-patch v4.12.0+incompatible ,github.com/fsnotify/fsnotify v1.5.1 ,github.com/go-logr/logr v1.2.3 ,github.com/gogo/protobuf v1.3.2 ,github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da ,github.com/golang/protobuf
 v1.5.2 ,github.com/google/go-cmp v0.5.6 ,github.com/google/gofuzz v1.2.0 ,github.com/google/uuid v1.3.0 ,github.com/googleapis/gnostic v0.5.5 ,github.com/imdario/mergo v0.3.12 ,github.com/jetstack/cert-manager v1.7.2 ,github.com/json-iterator/go v1.1.12 ,github.com/matttproud/golang_protobuf_extensions v1.0.2-0.20181231171920-c182affec369 ,github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd ,github.com/modern-go/reflect2 v1.0.2 ,github.com/nxadm/tail v1.4.
8 ,github.com/opencontainers/runtime-spec v1.0.3-0.20210326190908-1c3f411f0417 ,github.com/openshift/api v0.0.0-20220209124712-b632c5fc10c0 ,github.com/pjbgf/go-apparmor v0.0.7 ,github.com/pkg/errors v0.9.1 ,github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring v0.56.0 ,github.com/prometheus/client_golang v1.12.1 ,github.com/prometheus/client_model v0.2.0 ,github.com/prometheus/common v0.32.1 ,github.com/prometheus/procfs v0.7.3 ,github.com/russross/black
friday/v2 v2.1.0 ,github.com/seccomp/libseccomp-golang v0.9.2-0.20210429002308-3879420cc921 ,github.com/sirupsen/logrus v1.8.1 ,github.com/spf13/afero v1.6.0 ,github.com/spf13/pflag v1.0.5 ,github.com/urfave/cli/v2 v2.4.0 ,golang.org/x/net v0.0.0-20220225172249-27dd8689420f ,golang.org/x/oauth2 v0.0.0-20211104180415-d3ed0bb246c8 ,golang.org/x/sync v0.0.0-20210220032951-036812b2e83c ,golang.org/x/sys v0.0.0-20220315194320-039c03cc5b86 ,golang.org/x/term v0.0.0-202109272227
41-03fcf44c2211 ,golang.org/x/text v0.3.7 ,golang.org/x/time v0.0.0-20210723032227-1f47c861a9ac ,gomodules.xyz/jsonpatch/v2 v2.2.0 ,google.golang.org/genproto v0.0.0-20220118154757-00ab72f36ad5 ,google.golang.org/grpc v1.45.0 ,google.golang.org/protobuf v1.28.0 ,gopkg.in/inf.v0 v0.9.1 ,gopkg.in/tomb.v1 v1.0.0-20141024135613-dd632973f1e7 ,gopkg.in/yaml.v2 v2.4.0 ,gopkg.in/yaml.v3 v3.0.0-20210107192922-496545a6307b ,k8s.io/api v0.23.6 ,k8s.io/apiextensions-apiserver v0.23.5
 ,k8s.io/apimachinery v0.23.6 ,k8s.io/client-go v0.23.6 ,k8s.io/component-base v0.23.5 ,k8s.io/klog/v2 v2.60.1 ,k8s.io/kube-openapi v0.0.0-20211115234752-e816edb12b65 ,k8s.io/utils v0.0.0-20220210201930-3a6ce19ff2f9 ,sigs.k8s.io/controller-runtime v0.11.2 ,sigs.k8s.io/json v0.0.0-20211020170558-c049b76a60c6 ,sigs.k8s.io/release-utils v0.6.0 ,sigs.k8s.io/structured-merge-diff/v4 v4.2.1 ,sigs.k8s.io/yaml v1.3.0 " "gitCommit"="4b6bac26f9c0b151ea3183087f924a0ee6d9ed54" "gitCo
mmitDate"="2022-04-28T09:18:43Z" "gitTreeState"="dirty" "goVersion"="go1.18" "ldFlags"="-s -w -linkmode external -extldflags \"-static\" -X sigs.k8s.io/security-profiles-operator/internal/pkg/version.buildDate=1980-01-01T00:00:00Z -X sigs.k8s.io/security-profiles-operator/internal/pkg/version.version=0.4.3-dev" "libbpf"="v0.7" "libseccomp"="2.5.3" "platform"="linux/amd64" "version"="0.4.3-dev"
fatal error: timer when must be positive

goroutine 1 [running]:
runtime.throw({0x1b747fa?, 0xc000500000?})
        runtime/panic.go:992 +0x71 fp=0xc000272348 sp=0xc000272318 pc=0x43a1d1
runtime.addtimer(0x48ca01?)
        runtime/time.go:255 +0x12e fp=0xc000272390 sp=0xc000272348 pc=0x45928e
time.startTimer(0x46ae09?)
        runtime/time.go:213 +0x19 fp=0xc0002723a8 sp=0xc000272390 pc=0x469df9
time.NewTimer(0x1e46700?)
        time/sleep.go:96 +0xb7 fp=0xc0002723e8 sp=0xc0002723a8 pc=0x48cb97
net/http.setRequestCancel(0xc00018b400, {0x1e46700?, 0xc0005994a0}, {0xc000329aa0?, 0x10?, 0x2b85740?})
        net/http/client.go:394 +0x375 fp=0xc0002724c8 sp=0xc0002723e8 pc=0x689235
net/http.send(0xc00018b300, {0x1e46700, 0xc0005994a0}, {0x1aee740?, 0x48e901?, 0x2b85740?})
        net/http/client.go:250 +0x5a6 fp=0xc0002726a8 sp=0xc0002724c8 pc=0x6886e6
net/http.(*Client).send(0xc0000252c0, 0xc00018b300, {0x7fb1655323b0?, 0xf8?, 0x2b85740?})
        net/http/client.go:176 +0x9b fp=0xc000272720 sp=0xc0002726a8 pc=0x687fbb
net/http.(*Client).do(0xc0000252c0, 0xc00018b300)
        net/http/client.go:725 +0x8f5 fp=0xc000272920 sp=0xc000272720 pc=0x68a3f5
net/http.(*Client).Do(...)
        net/http/client.go:593
k8s.io/client-go/rest.(*Request).request(0xc00018b100, {0x1e5b9c8, 0xc000345c20}, 0x1?)
        k8s.io/[email protected]/rest/request.go:980 +0x419 fp=0xc000272a20 sp=0xc000272920 pc=0x10c1b19
k8s.io/client-go/rest.(*Request).Do(0x2?, {0x1e5b990?, 0xc00011a008?})
        k8s.io/[email protected]/rest/request.go:1038 +0xc7 fp=0xc000272ad0 sp=0xc000272a20 pc=0x10c2567
k8s.io/client-go/discovery.(*DiscoveryClient).ServerGroups(0xc0005994e0)
        k8s.io/[email protected]/discovery/discovery_client.go:161 +0xae fp=0xc000272d38 sp=0xc000272ad0 pc=0x1232b0e
k8s.io/client-go/discovery.ServerGroupsAndResources({0x1e60dd8, 0xc0005994e0})
        k8s.io/[email protected]/discovery/discovery_client.go:258 +0x42 fp=0xc000272ea8 sp=0xc000272d38 pc=0x12338c2
k8s.io/client-go/discovery.(*DiscoveryClient).ServerGroupsAndResources.func1()
        k8s.io/[email protected]/discovery/discovery_client.go:223 +0x25 fp=0xc000272ec8 sp=0xc000272ea8 pc=0x12335c5
k8s.io/client-go/discovery.withRetries(0x2, 0xc000272ef0)
        k8s.io/[email protected]/discovery/discovery_client.go:452 +0x72 fp=0xc000272ee0 sp=0xc000272ec8 pc=0x1235712
k8s.io/client-go/discovery.(*DiscoveryClient).ServerGroupsAndResources(0x0?)
        k8s.io/[email protected]/discovery/discovery_client.go:222 +0x3a fp=0xc000272f10 sp=0xc000272ee0 pc=0x123357a
k8s.io/client-go/restmapper.GetAPIGroupResources({0x1e60dd8?, 0xc0005994e0?})
        k8s.io/[email protected]/restmapper/discovery.go:148 +0x42 fp=0xc0002730f8 sp=0xc000272f10 pc=0x12383c2
sigs.k8s.io/controller-runtime/pkg/client/apiutil.NewDynamicRESTMapper.func1()
        sigs.k8s.io/[email protected]/pkg/client/apiutil/dynamicrestmapper.go:85 +0x25 fp=0xc000273120 sp=0xc0002730f8 pc=0x12399e5
sigs.k8s.io/controller-runtime/pkg/client/apiutil.(*dynamicRESTMapper).setStaticMapper(...)
        sigs.k8s.io/[email protected]/pkg/client/apiutil/dynamicrestmapper.go:118
sigs.k8s.io/controller-runtime/pkg/client/apiutil.NewDynamicRESTMapper(0xc0000cefc0?, {0x0, 0x0, 0x766a0abb650a7301?})
        sigs.k8s.io/[email protected]/pkg/client/apiutil/dynamicrestmapper.go:98 +0x16b fp=0xc000273168 sp=0xc000273120 pc=0x123992b
sigs.k8s.io/controller-runtime/pkg/cluster.setOptionsDefaults.func1(0x0?)
        sigs.k8s.io/[email protected]/pkg/cluster/cluster.go:216 +0x25 fp=0xc000273198 sp=0xc000273168 pc=0x1486985
sigs.k8s.io/controller-runtime/pkg/cluster.New(0xc0002d7440, {0xc0002737f0, 0x1, 0x0?})
        sigs.k8s.io/[email protected]/pkg/cluster/cluster.go:158 +0x18d fp=0xc000273440 sp=0xc000273198 pc=0x14863ad
sigs.k8s.io/controller-runtime/pkg/manager.New(_, {0x0, 0x0, 0x2b0aaf0, {{0x1e5e610, 0xc00004a0c0}, 0x0}, 0x0, {0x0, 0x0}, ...})
        sigs.k8s.io/[email protected]/pkg/manager/manager.go:313 +0xdc fp=0xc000273868 sp=0xc000273440 pc=0x14b5ffc
main.runDaemon(0xc00015f6c0, 0xc00015f6c0?)
        sigs.k8s.io/security-profiles-operator/cmd/security-profiles-operator/main.go:401 +0x1e2 fp=0xc000273bf8 sp=0xc000273868 pc=0x1618a22
main.main.func3(0xc000229680?)
        sigs.k8s.io/security-profiles-operator/cmd/security-profiles-operator/main.go:131 +0x1d fp=0xc000273c18 sp=0xc000273bf8 pc=0x161745d
github.com/urfave/cli/v2.(*Command).Run(0xc000229680, 0xc00015f4c0)
        github.com/urfave/cli/[email protected]/command.go:163 +0x5bb fp=0xc000273d28 sp=0xc000273c18 pc=0xbfbe9b
github.com/urfave/cli/v2.(*App).RunContext(0xc00058aea0, {0x1e5b990?, 0xc00011a000}, {0xc00012c000, 0x2, 0x2})
        github.com/urfave/cli/[email protected]/app.go:313 +0xb48 fp=0xc000273eb8 sp=0xc000273d28 pc=0xbf9688
github.com/urfave/cli/v2.(*App).Run(...)
        github.com/urfave/cli/[email protected]/app.go:224
main.main()
        sigs.k8s.io/security-profiles-operator/cmd/security-profiles-operator/main.go:219 +0x107b fp=0xc000273f80 sp=0xc000273eb8 pc=0x16172bb
runtime.main()
        runtime/proc.go:250 +0x212 fp=0xc000273fe0 sp=0xc000273f80 pc=0x43c8f2
runtime.goexit()
        runtime/asm_amd64.s:1571 +0x1 fp=0xc000273fe8 sp=0xc000273fe0 pc=0x46cfc1

ccojocar avatar Apr 28 '22 11:04 ccojocar

My assumption is that this might be related to the system timer. It might be caused by a clock desynchronisation, but I am not totally sure.
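
A quick way to probe that hypothesis, purely as a sketch and not anything shipped with the operator, is to run a tiny Go program on the affected node (or inside the pod) and watch the monotonic clock. If elapsed readings jump backwards or take absurd values, the runtime is being fed bad nanotime() readings, which would explain a non-positive timer "when":

// clockcheck.go - a minimal sketch (not part of the operator) for testing the
// "broken monotonic clock" hypothesis on an affected node or inside the pod.
// It samples the wall clock and the monotonic clock repeatedly; backwards
// jumps or absurd elapsed values would support the desynchronisation theory.
package main

import (
	"fmt"
	"time"
)

func main() {
	start := time.Now() // carries a monotonic reading on Linux
	for i := 0; i < 10; i++ {
		time.Sleep(100 * time.Millisecond)
		now := time.Now()
		fmt.Printf("wall=%s monotonic_elapsed=%s\n",
			now.Format(time.RFC3339Nano),
			now.Sub(start)) // uses the monotonic clock when available
	}
}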

ccojocar avatar Apr 28 '22 11:04 ccojocar

I'm occasionally hitting this issue with OCP builds as well:

fatal error: timer when must be positive

goroutine 19 [running]:
runtime.throw({0x1af70a3?, 0x2c6a980?})
	runtime/panic.go:992 +0x71 fp=0xc00006ae88 sp=0xc00006ae58 pc=0x4394b1
runtime.modtimer(0xb42b06ab9201df43?, 0xc000000000?, 0xc0003be000?, 0x424f00?, {0x0?, 0x0?}, 0x0?)
	runtime/time.go:426 +0x4a7 fp=0xc00006aed0 sp=0xc00006ae88 pc=0x459127
runtime.resettimer(...)
	runtime/time.go:540
time.resetTimer(...)
	runtime/time.go:230
runtime.scavengeSleep(0x4631da68f2e17000)
	runtime/mgcscavenge.go:244 +0x73 fp=0xc00006af20 sp=0xc00006aed0 pc=0x424893
runtime.bgscavenge(0x0?)
	runtime/mgcscavenge.go:394 +0x1af fp=0xc00006afc8 sp=0xc00006af20 pc=0x424aef
runtime.gcenable.func2()
	runtime/mgc.go:178 +0x26 fp=0xc00006afe0 sp=0xc00006afc8 pc=0x41c926
runtime.goexit()
	runtime/asm_amd64.s:1571 +0x1 fp=0xc00006afe8 sp=0xc00006afe0 pc=0x46c181
created by runtime.gcenable
	runtime/mgc.go:178 +0xaa

goroutine 1 [runnable]:
syscall.Syscall6(0x106, 0xffffffffffffff9c, 0xc0000463f0, 0xc000544108, 0x0, 0x0, 0x0)
	syscall/asm_linux_amd64.s:43 +0x5
syscall.fstatat(0xc000356000?, {0x7fff7a5bdac1?, 0xc000356000?}, 0xc000544108?, 0xc0000545c0?)
	syscall/zsyscall_linux_amd64.go:1450 +0x10f
syscall.Stat(...)
	syscall/syscall_linux_amd64.go:60
os.statNolog.func1(...)
	os/stat_unix.go:32
os.ignoringEINTR(...)
	os/file_posix.go:245
os.statNolog({0x7fff7a5bdac1, 0x29})
	os/stat_unix.go:31 +0x58
os.Stat({0x7fff7a5bdac1, 0x29})
	os/stat.go:13 +0x34
github.com/urfave/cli/v2.compileTime()
	github.com/urfave/cli/[email protected]/app.go:104 +0x33
github.com/urfave/cli/v2.NewApp()
	github.com/urfave/cli/[email protected]/app.go:121 +0x77
main.main()
	sigs.k8s.io/security-profiles-operator/cmd/security-profiles-operator/main.go:79 +0x2b

goroutine 21 [sleep]:
time.Sleep(0x6fc23ac00)
	runtime/time.go:194 +0x12e
sigs.k8s.io/controller-runtime/pkg/log.init.0.func1()
	sigs.k8s.io/[email protected]/pkg/log/log.go:63 +0x38
created by sigs.k8s.io/controller-runtime/pkg/log.init.0
	sigs.k8s.io/[email protected]/pkg/log/log.go:62 +0x25

We saw this today on a 6-node cluster. The interesting thing was that only one of the six nodes was showing the issue. I didn't find anything different about that node compared to the others (for example, the time was in sync and set to UTC on all nodes).

jhrozek avatar May 16 '22 09:05 jhrozek

I really wonder whether this could be an issue with the Go version we are currently using, because all the frames in the stack trace of the crashing goroutine point to the Go standard library:

runtime.throw({0x1af70a3?, 0x2c6a980?})
	runtime/panic.go:992 +0x71 fp=0xc00006ae88 sp=0xc00006ae58 pc=0x4394b1
runtime.modtimer(0xb42b06ab9201df43?, 0xc000000000?, 0xc0003be000?, 0x424f00?, {0x0?, 0x0?}, 0x0?)
	runtime/time.go:426 +0x4a7 fp=0xc00006aed0 sp=0xc00006ae88 pc=0x459127
runtime.resettimer(...)
	runtime/time.go:540
time.resetTimer(...)
	runtime/time.go:230
runtime.scavengeSleep(0x4631da68f2e17000)
	runtime/mgcscavenge.go:244 +0x73 fp=0xc00006af20 sp=0xc00006aed0 pc=0x424893
runtime.bgscavenge(0x0?)
	runtime/mgcscavenge.go:394 +0x1af fp=0xc00006afc8 sp=0xc00006af20 pc=0x424aef
runtime.gcenable.func2()
	runtime/mgc.go:178 +0x26 fp=0xc00006afe0 sp=0xc00006afc8 pc=0x41c926
runtime.goexit()
	runtime/asm_amd64.s:1571 +0x1 fp=0xc00006afe8 sp=0xc00006afe0 pc=0x46c181
created by runtime.gcenable
	runtime/mgc.go:178 +0xaa

OTOH, the only somewhat related issue I found is https://github.com/golang/go/issues/51341, but that is 1) BSD-specific and 2) reports an EFAULT rather than the "timer when must be positive" error. If this were really a Go issue, I would expect many more bug reports, and the bug to have been found much earlier...
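
For what it's worth, the raw arguments in the traces above already look suspicious on their own. If I read the Go 1.17/1.18 runtime correctly, the argument to scavengeSleep is the intended sleep in nanoseconds, and a throwaway snippet (mine, not from the repo) that just converts those hex values into durations shows that two of the three are nowhere near a sane sleep interval, which points at the clock arithmetic rather than at any particular caller:

// decode.go - a throwaway helper that converts the raw scavengeSleep arguments
// from the stack traces in this thread into human-readable durations. Only the
// value from the first trace looks like a plausible sleep interval; the other
// two suggest the runtime's time arithmetic produced garbage.
package main

import (
	"fmt"
	"math"
	"time"
)

func main() {
	args := []struct {
		name string
		ns   int64
	}{
		{"first trace (0x1e8480)", 0x1e8480},                   // 2,000,000 ns = 2ms, plausible
		{"OCP trace (0x4631da68f2e17000)", 0x4631da68f2e17000}, // roughly 1.4 million hours, not plausible
		{"minikube trace (0x8000000000000000)", math.MinInt64}, // reinterpreted as int64: negative
	}
	for _, a := range args {
		fmt.Printf("%-40s -> %v\n", a.name, time.Duration(a.ns))
	}
}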

jhrozek avatar May 16 '22 10:05 jhrozek

I'm currently facing the same issue as well.

+ kubectl logs spod-5bjlf
Defaulted container "security-profiles-operator" out of: security-profiles-operator, metrics, non-root-enabler (init)
fatal error: timer when must be positive

goroutine 4 [running]:
runtime.throw({0x1c412ee?, 0x2cf2e00?})
	runtime/panic.go:992 +0x71 fp=0xc00005fe88 sp=0xc00005fe58 pc=0x43a031
runtime.modtimer(0x10752ed57ccfe0?, 0xc000400000?, 0xc000600000?, 0x425a80?, {0x0?, 0x0?}, 0x0?)
	runtime/time.go:426 +0x4a7 fp=0xc00005fed0 sp=0xc00005fe88 pc=0x459e27
runtime.resettimer(...)
	runtime/time.go:540
time.resetTimer(...)
	runtime/time.go:230
runtime.scavengeSleep(0x8000000000000000)
	runtime/mgcscavenge.go:244 +0x73 fp=0xc00005ff20 sp=0xc00005fed0 pc=0x425413
runtime.bgscavenge(0x0?)
	runtime/mgcscavenge.go:394 +0x1af fp=0xc00005ffc8 sp=0xc00005ff20 pc=0x42566f
runtime.gcenable.func2()
	runtime/mgc.go:178 +0x26 fp=0xc00005ffe0 sp=0xc00005ffc8 pc=0x41d4a6
runtime.goexit()
	runtime/asm_amd64.s:1571 +0x1 fp=0xc00005ffe8 sp=0xc00005ffe0 pc=0x46ce81
created by runtime.gcenable
	runtime/mgc.go:178 +0xaa

goroutine 1 [runnable, locked to thread]:
github.com/prometheus/client_golang/prometheus.(*Registry).Register(0xc0005003c0, {0x1f29828?, 0xc00018c090?})
	github.com/prometheus/[email protected]/prometheus/registry.go:356 +0xa1a
github.com/prometheus/client_golang/prometheus.(*Registry).MustRegister(0x0?, {0xc000420580?, 0x1, 0xc0005c6cf0?})
	github.com/prometheus/[email protected]/prometheus/registry.go:402 +0x53
github.com/prometheus/client_golang/prometheus.MustRegister(...)
	github.com/prometheus/[email protected]/prometheus/registry.go:178
github.com/prometheus/client_golang/prometheus.init.0()
	github.com/prometheus/[email protected]/prometheus/registry.go:63 +0xf5

goroutine 6 [sleep, 91230067 minutes]:
time.Sleep(0x6fc23ac00)
	runtime/time.go:194 +0x12e
sigs.k8s.io/controller-runtime/pkg/log.init.0.func1()
	sigs.k8s.io/[email protected]/pkg/log/log.go:63 +0x38
created by sigs.k8s.io/controller-runtime/pkg/log.init.0
	sigs.k8s.io/[email protected]/pkg/log/log.go:62 +0x25

#1017


Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Pulling    2m                   kubelet            Pulling image "gcr.io/k8s-staging-sp-operator/security-profiles-operator:latest"
  Normal   Scheduled  2m                   default-scheduler  Successfully assigned security-profiles-operator/spod-sb76b to minikube
  Normal   Pulled     118s                 kubelet            Successfully pulled image "gcr.io/k8s-staging-sp-operator/security-profiles-operator:latest" in 1.792761921s
  Normal   Created    117s                 kubelet            Created container non-root-enabler
  Normal   Started    117s                 kubelet            Started container non-root-enabler
  Normal   Pulled     114s                 kubelet            Successfully pulled image "gcr.io/k8s-staging-sp-operator/security-profiles-operator:latest" in 1.445520245s
  Normal   Started    114s                 kubelet            Started container security-profiles-operator
  Normal   Pulled     114s                 kubelet            Container image "quay.io/brancz/kube-rbac-proxy:v0.11.0" already present on machine
  Normal   Started    113s                 kubelet            Started container metrics
  Normal   Created    113s                 kubelet            Created container metrics
  Normal   Killing    84s                  kubelet            Container security-profiles-operator failed startup probe, will be restarted
  Normal   Pulling    84s (x2 over 116s)   kubelet            Pulling image "gcr.io/k8s-staging-sp-operator/security-profiles-operator:latest"
  Warning  Unhealthy  84s (x10 over 111s)  kubelet            Startup probe failed: Get "http://10.244.0.27:8085/healthz": dial tcp 10.244.0.27:8085: connect: connection refused
  Normal   Created    82s (x2 over 114s)   kubelet            Created container security-profiles-operator
  Normal   Pulled     82s                  kubelet            Successfully pulled image "gcr.io/k8s-staging-sp-operator/security-profiles-operator:latest" in 1.931630015s

B3ns44d avatar Jul 06 '22 18:07 B3ns44d

I think that the "fatal error: timer when must be positive" is related to the addtimer func:

// addtimer adds a timer to the current P.
// This should only be called with a newly created timer.
// That avoids the risk of changing the when field of a timer in some P's heap,
// which could cause the heap to become unsorted.
func addtimer(t *timer) {
	// when must be positive. A negative value will cause runtimer to
	// overflow during its delta calculation and never expire other runtime
	// timers. Zero will cause checkTimers to fail to notice the timer.
	if t.when <= 0 {
		throw("timer when must be positive")
	}
	if t.period < 0 {
		throw("timer period must be non-negative")
	}
	if t.status != timerNoStatus {
		throw("addtimer called with initialized timer")
	}
	t.status = timerWaiting

	when := t.when

	// Disable preemption while using pp to avoid changing another P's heap.
	mp := acquirem()

	// Get current P
	pp := getg().m.p.ptr()
	lock(&pp.timersLock)
	// Clear the timer in the header of P's timer list
	cleantimers(pp)
	// add timer to the min heap of P
	doaddtimer(pp, t)
	unlock(&pp.timersLock)
	// wake up sleeping threads in netpoller
	wakeNetPoller(when)

	releasem(mp)
}

  1. addtimer first validates the timer: the wake-up time (when) must be positive, the period must be non-negative, and the status must be that of a newly created timer (timerNoStatus).
  2. It then takes the current P's timersLock, calls cleantimers to clean up the head of that P's timer list, calls doaddtimer to push the timer onto P's min-heap, and releases the lock.
  3. Finally, wakeNetPoller is called to wake up a sleeping thread in the netpoller (a small sketch of the caller-side guard in the time package follows the references below).
  • addtimer func: https://github.com/golang/go/blob/master/src/runtime/time.go#L256
  • modtimer func: https://github.com/golang/go/blob/master/src/runtime/time.go#L430

references:

  • https://www.sobyte.net/post/2022-01/go-timer-analysis/
  • https://blog.csdn.net/weixin_44505163/article/details/121191165
  • https://github.com/golang/go/blob/master/src/runtime/time.go
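
One more data point supporting the clock theory: ordinary callers normally cannot produce a non-positive when at all, because the time package clamps the value before the timer ever reaches addtimer. Below is a simplified paraphrase of that guard (treat it as a sketch, not the exact stdlib source); the only way the result can be <= 0 is if the monotonic clock reading itself ("now" in the sketch, runtimeNano() in the real code) is nonsensical.

// A simplified paraphrase (not a verbatim copy) of the guard the standard
// library applies in time/sleep.go before a timer ever reaches addtimer.
// Negative durations are ignored and overflow is clamped to MaxInt64, so a
// non-positive t.when implies a bogus monotonic clock reading, not caller error.
package main

import (
	"fmt"
	"math"
	"time"
)

// when mirrors the stdlib helper: it computes the absolute wake-up time,
// in nanoseconds, for a timer firing d after "now".
func when(now int64, d time.Duration) int64 {
	if d <= 0 {
		return now // negative/zero durations fire immediately
	}
	t := now + int64(d)
	if t < 0 {
		t = math.MaxInt64 // clamp overflow instead of going negative
	}
	return t
}

func main() {
	// With a sane clock the result is always positive...
	fmt.Println(when(1_000_000, 2*time.Millisecond)) // 3000000
	// ...but a bogus (negative) monotonic reading propagates straight through,
	// which is exactly the condition addtimer/modtimer refuse with
	// "timer when must be positive".
	fmt.Println(when(-5_000_000, 0)) // -5000000
}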

B3ns44d avatar Jul 07 '22 10:07 B3ns44d

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Oct 05 '22 11:10 k8s-triage-robot

/remove-lifecycle stale

We have a similar bug open against OCP at the moment, and while we couldn't reproduce it lately, I'd prefer to keep this issue open for one more "stale/rotten" cycle before closing.

jhrozek avatar Oct 05 '22 13:10 jhrozek

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jan 03 '23 14:01 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Feb 02 '23 15:02 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Mar 04 '23 16:03 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Mar 04 '23 16:03 k8s-ci-robot

Did anyone find a workaround for this problem? I'm running into exactly the same error, "fatal error: timer when must be positive", when installing with Helm on a kind cluster. All the pods in the cluster are running except spod.

huang195 avatar May 09 '23 16:05 huang195

/reopen

CoreyCook8 avatar Feb 12 '24 16:02 CoreyCook8

@CoreyCook8: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Feb 12 '24 16:02 k8s-ci-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Mar 14 '24 08:03 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Mar 14 '24 08:03 k8s-ci-robot