piraeus-operator

drbd-module-loader having an issue with Linux kernel 4.18.0-553.62.1.el8_10.x86_64

Open saarthak18 opened this issue 5 months ago • 7 comments

Hey, I am facing an issue where the drbd-module-loader init container is stuck in a crash loop with the following error. The setup on which this error occurs runs kernel version 4.18.0-553.62.1.el8_10.x86_64.

make -C /lib/modules/4.18.0-553.62.1.el8_10.x86_64/build "PRE_CFLAGS=" M=/tmp/pkg/drbd-9.2.14/drbd obj-m=dummy-for-prep.o dummy-for-patch.o
make -C /tmp/pkg/drbd-9.2.14/drbd -f Makefile.spatch /tmp/pkg/drbd-9.2.14/drbd/build-4.18.0-553.62.1.el8_10.x86_64/compat.patch
  GENPATCHNAMES   4.18.0-553.62.1.el8_10.x86_64
  No local spatch found.
  No (suitable) spatch found in $PATH.
  INFO: spatch failed, or no suitable spatch found; trying spatch-as-a-service;
  be patient, may take up to 10 minutes.
  If it is in the server side cache it might only take a second.
  SPAAS    7b50a3a3b2a2c5c82fd5a60943dc6df6

I am running the LINSTOR cluster on a different setup where the kernel version is 4.18.0-553.51.1.el8_10.x86_64, and there is no issue there.

I am on RHEL 8 and the drbd-module-loader is on version 9.2.14. Is there a newer version I have to upgrade to?

saarthak18 avatar Jul 26 '25 12:07 saarthak18

I'm guessing the important information at the end is missing from the logs. It looks like we do not have a pre-computed patch for this kernel, so we need to generate one using spatch. However, RHEL8 does not have the necessary version of spatch available, which is why we try to contact "SPAAS" (spatch-as-a-service), which LINBIT hosts as a convenience service.

If you are somehow unable to contact SPAAS (e.g. because you are in some kind of disconnected environment), you should see a message like:

curl: (6) Could not resolve host: spaas.drbd.io
  ERROR: SPAAS is not reachable! Please check if your network
  configuration or some firewall prohibits access to 
  'https://spaas.drbd.io'.

Note: LINBIT provides prebuilt kernel modules for RHEL, so if you used the LINBIT SDS offering, you would not need to build DRBD manually on the node.

WanzenBug avatar Jul 28 '25 11:07 WanzenBug

Hi Wanzen, sorry for leaving out the full error message.

This is the error I am facing:

  UPD     /tmp/pkg/drbd-9.2.14/drbd/build-4.18.0-553.62.1.el8_10.x86_64/compat.h
  UPD     /tmp/pkg/drbd-9.2.14/drbd/build-4.18.0-553.62.1.el8_10.x86_64/.drbd_kernelrelease
  LN      build-current -> build-4.18.0-553.62.1.el8_10.x86_64/
  LN      compat.h -> build-4.18.0-553.62.1.el8_10.x86_64/compat.h
  LN      .compat_test -> build-4.18.0-553.62.1.el8_10.x86_64/.compat_test
  LN      compat.4.18.0-553.62.1.el8_10.x86_64.h -> build-4.18.0-553.62.1.el8_10.x86_64/compat.h
  LN      .compat_test.4.18.0-553.62.1.el8_10.x86_64 -> build-4.18.0-553.62.1.el8_10.x86_64/.compat_test
make -C /lib/modules/4.18.0-553.62.1.el8_10.x86_64/build "PRE_CFLAGS=" M=/tmp/pkg/drbd-9.2.14/drbd obj-m=dummy-for-prep.o dummy-for-patch.o
make -C /tmp/pkg/drbd-9.2.14/drbd -f Makefile.spatch /tmp/pkg/drbd-9.2.14/drbd/build-4.18.0-553.62.1.el8_10.x86_64/compat.patch
  GENPATCHNAMES   4.18.0-553.62.1.el8_10.x86_64
  No local spatch found.
  No (suitable) spatch found in $PATH.
  INFO: spatch failed, or no suitable spatch found; trying spatch-as-a-service;
  be patient, may take up to 10 minutes.
  If it is in the server side cache it might only take a second.
  SPAAS    7b50a3a3b2a2c5c82fd5a60943dc6df6
curl: (6) Could not resolve host: spaas.drbd.io
  ERROR: SPAAS is not reachable! Please check if your network
  configuration or some firewall prohibits access to
  'https://spaas.drbd.io'.
make[4]: *** [Makefile.spatch:55: drbd-kernel-compat/cocci_cache/7b50a3a3b2a2c5c82fd5a60943dc6df6/compat.patch] Error 1
make[3]: *** [/tmp/pkg/drbd-9.2.14/drbd/Kbuild:147: /tmp/pkg/drbd-9.2.14/drbd/build-4.18.0-553.62.1.el8_10.x86_64/compat.patch] Error 2
make[2]: *** [Makefile:1808: dummy-for-patch.o] Error 2
make[1]: Leaving directory '/tmp/pkg/drbd-9.2.14/drbd'
make[1]: *** [Makefile:244: prep] Error 2
make: *** [Makefile:131: module] Error 2

We are currently using the k8s version of LINBIT, and this is the error I am facing in the init container called drbd-module-loader.

We deploy the piraeus Helm chart first and then install the linstor cluster chart to get the LINSTOR cluster up and running.

Is LINBIT SDS something different from this?

saarthak18 avatar Jul 28 '25 11:07 saarthak18

Are you using the images from drbd.io? If not, you are not using the LINBIT version.

WanzenBug avatar Jul 28 '25 12:07 WanzenBug

We are using quay.io/piraeusdatastore/drbd9-almalinux8:v9.2.14 for the drbd-module-loader. Would https://drbd.io/repo/drbd9-rhel8 be its substitute?

How do I pull the images from this private repo? Are these paid versions?

And how do I resolve this using the open-source version?

saarthak18 avatar Jul 28 '25 13:07 saarthak18

You need to be a LINBIT customer, then you can use the drbd.io registry.

Alternatively, you can host your own spatch-as-a-service: https://github.com/LINBIT/saas Deploy the docker image somewhere and set the environment variable in the loader to LB_MAKEOPTS=SPAAS_URL=<some-internal-url>
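As a rough sketch of where that variable could go when the loader is managed by the Piraeus operator: the fragment below assumes the init container is named drbd-module-loader and that its environment can be patched through the pod template of a LinstorSatelliteConfiguration resource. The resource name is hypothetical, and <some-internal-url> is a placeholder for wherever your own SPAAS instance is reachable.

```yaml
# Sketch only: the metadata name is a hypothetical example, and
# <some-internal-url> must be replaced with your self-hosted SPAAS address.
apiVersion: piraeus.io/v1
kind: LinstorSatelliteConfiguration
metadata:
  name: spaas-url-override
spec:
  podTemplate:
    spec:
      initContainers:
        - name: drbd-module-loader
          env:
            # LB_MAKEOPTS is passed through to make, so SPAAS_URL becomes a make variable.
            - name: LB_MAKEOPTS
              value: "SPAAS_URL=<some-internal-url>"
```

The self-hosted SPAAS instance still has to be reachable from the satellite pods for the loader to use it.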

WanzenBug avatar Jul 29 '25 05:07 WanzenBug

Hey @WanzenBug, I am trying to set the env variable. Since the drbd-module-loader is deployed through the piraeus-operator's CRDs, I am not exactly sure where to define this env variable. Can you help me with it?

I tried to set it here: https://github.com/piraeusdatastore/piraeus-operator/blob/v2/charts/piraeus/templates/crds.yaml#L1092. It then gave a warning "spec.versions[0].schema.openAPIV3Schema.properties.spec.properties.podTemplate.spec" when I set it this way:

podTemplate:
  spec:
    initContainers:
      - name: drbd-module-loader
        command: ["/bin/sh"]
        args: ["-c", "echo Sleeping... && sleep 10000"]
        env:
          - name: LB_MAKEOPTS
            value: "https://your.internal.url"

I have also tried to set it under the operator (https://github.com/piraeusdatastore/piraeus-operator/blob/v2/charts/piraeus/templates/crds.yaml#L969) in this way:

operator:
  satelliteSet:
    additionalEnv:
      - name: SPAAS_URL
        value: "http://spaas.nsp-psa-privileged.svc.cluster.local:93"

Note: I added the sleep command only for debugging.

saarthak18 avatar Jul 29 '25 11:07 saarthak18

Hi @WanzenBug, I used the following CR to configure the drbd-module-loader and successfully applied the SPAAS_URL env variable:

apiVersion: piraeus.io/v1
kind: LinstorSatelliteConfiguration
metadata:
  name: satellite-initcontainer-config
spec:
  podTemplate:
    spec:
      initContainers:
        - name: drbd-module-loader
          env:
            - name: SPAAS_URL
              value: "http://spaas.nsp-psa-privileged.svc.cluster.local:93"

But I am getting the following error:

`make -C /lib/modules/4.18.0-553.62.1.el8_10.x86_64/build    "PRE_CFLAGS=" M=/tmp/pkg/drbd-9.2.14/drbd obj-m=dummy-for-prep.o dummy-for-patch.o
make -C /tmp/pkg/drbd-9.2.14/drbd -f Makefile.spatch /tmp/pkg/drbd-9.2.14/drbd/build-4.18.0-553.62.1.el8_10.x86_64/compat.patch
  GENPATCHNAMES   4.18.0-553.62.1.el8_10.x86_64
    No local spatch found.
  No (suitable) spatch found in $PATH.
  INFO: spatch failed, or no suitable spatch found; trying spatch-as-a-service;
  be patient, may take up to 10 minutes.
  If it is in the server side cache it might only take a second.
  SPAAS    7b50a3a3b2a2c5c82fd5a60943dc6df6
Successfully connected to SPAAS ('49e625caa996f2ddcb088726589d90acc5965825')
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  8929    0     0    0  8929      0   2003 --:--:--  0:00:04 --:--:--  2002
curl: (22) The requested URL returned error: 400 Bad Request
=== pipestatus: 0 22
HTTP/1.1 100 Continue

cat: drbd-kernel-compat/cocci_cache/7b50a3a3b2a2c5c82fd5a60943dc6df6/compat.patch.tmp: No such file or directory
make[4]: *** [Makefile.spatch:55: drbd-kernel-compat/cocci_cache/7b50a3a3b2a2c5c82fd5a60943dc6df6/compat.patch] Error 1
make[3]: *** [/tmp/pkg/drbd-9.2.14/drbd/Kbuild:147: /tmp/pkg/drbd-9.2.14/drbd/build-4.18.0-553.62.1.el8_10.x86_64/compat.patch] Error 2
make[2]: *** [Makefile:1808: dummy-for-patch.o] Error 2
make[1]: *** [Makefile:244: prep] Error 2
make[1]: Leaving directory '/tmp/pkg/drbd-9.2.14/drbd'
make: *** [Makefile:131: module] Error 2

Could not find the expexted *.ko, see stderr for more details
`

I also see the following error on the SPAAS container I ran:

{"level":"error","ts":1753816926.7899098,"caller":"saas/main.go:409","msg":"Could not generate patch: Get \"https://pkg.linbit.com/downloads/drbd//9/drbd-9.2.14.tar.gz\": dial tcp: lookup pkg.linbit.com: Temporary failure in name resolution","type":"error","remoteAddr":"10.233.90.152:39102","code":400,"stacktrace":"main.(*server).errorf\n\t/go/src/saas/main.go:409\nmain.(*server).routes.(*server).spatchCreate.func1\n\t/go/src/saas/main.go:192\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2294\ngithub.com/gorilla/mux.(*Router).ServeHTTP\n\t/go/pkg/mod/github.com/gorilla/[email protected]/mux.go:212\nmain.(*server).ServeHTTP\n\t/go/src/saas/main.go:125\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:3301\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:2102"}

I have set these env variables:

patchcache=/var/cache/saas/patches
tarcache=/var/cache/saas/tarballs

Am I going in the right direction?

saarthak18 avatar Jul 29 '25 18:07 saarthak18