
Current `main` doesn't start on Scaleway

Open metalmatze opened this issue 2 years ago • 4 comments

It would be great to be able to run the latest version of Parca Agent on our demo.parca.dev instance.

Right now the agent doesn't start on Scaleway Kubernetes nodes logging the following:

level=warn name=parca-agent ts=2022-08-29T12:22:59.162801832Z caller=main.go:139 msg="failed to determine if eBPF is supported" err="kernel config not found"
level=error name=parca-agent ts=2022-08-29T12:22:59.162868702Z caller=main.go:121 err="host kernel does not support eBPF"

The pod is running image ghcr.io/parca-dev/parca-agent:main-5920bd0f with arguments:

      /bin/parca-agent
      --log-level=info
      --node=$(NODE_NAME)
      --remote-store-address=parca.parca.svc.cluster.local:7070
      --remote-store-insecure
      --remote-store-insecure-skip-verify

When I kubectl exec into a pod (the Parca pod, since the agent image has no shell inside), there seems to be no /boot directory available, nor any /proc/config* files.

metalmatze avatar Aug 29 '22 12:08 metalmatze

Could you try mounting the config with a hostPath volume?

spec:
  containers:
  - name: parca-agent
    volumeMounts:
    - mountPath: /boot/config
      name: kconfig
  volumes:
  - name: kconfig
    hostPath:
      path: /path/to/kconfig
      type: File

maxbrunet avatar Aug 29 '22 16:08 maxbrunet

Just tried this and also looked around on a similar host. The files kconfig is looking for don't exist on those machines.

metalmatze avatar Aug 30 '22 10:08 metalmatze

I'll look into where kernel configs are stored in these environments (hint: probably /usr/src). However, I won't have time to tackle this for the next 3 weeks because of priority issues and conference travel. Until then, I'd like to reiterate my suggestion of keeping these logs at the warning level. These checks are there for the user's convenience, and they shouldn't stop users from running the agent right now.
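To make the "more config locations" idea concrete, here is a minimal sketch of a lookup that tries several candidate paths for a given kernel release. The candidate list and the function name `find_kernel_config` are assumptions for illustration and are not taken from Parca Agent's source. The `/usr/src` and `/lib/modules` entries are common distribution conventions and are not confirmed for Scaleway nodes.

```python
import os

# Hypothetical candidate list (an assumption, not Parca Agent's actual list).
# Paths are relative so they can be joined onto any root, e.g. a host mount.
CANDIDATES = (
    "proc/config.gz",
    "boot/config-{release}",
    "usr/src/linux-headers-{release}/.config",
    "usr/src/linux-{release}/.config",
    "lib/modules/{release}/config",
    "lib/modules/{release}/build/.config",
)


def find_kernel_config(release, root="/"):
    """Return the first candidate config file that exists under root, else None."""
    for template in CANDIDATES:
        path = os.path.join(root, template.format(release=release))
        if os.path.isfile(path):
            return path
    return None
```

Calling `find_kernel_config(os.uname().release)` checks the running kernel. Passing a different `root` lets the same lookup run against a host filesystem mounted somewhere else in the container.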

v-thakkar avatar Sep 06 '22 06:09 v-thakkar

@v-thakkar I can have a look at this. Let's talk about it offline.

kakkoyun avatar Sep 06 '22 07:09 kakkoyun

Still happening with ghcr.io/parca-dev/parca-agent:main-4ba5c0a1.

metalmatze avatar Oct 06 '22 10:10 metalmatze

While we have reduced the logs to a warning level now, let's keep this open to add more config locations for the checks.

v-thakkar avatar Oct 06 '22 11:10 v-thakkar

> While we have reduced the logs to a warning level now, let's keep this open to add more config locations for the checks.

The PR that changed the log level: https://github.com/parca-dev/parca-agent/pull/875

kakkoyun avatar Oct 06 '22 11:10 kakkoyun

I will add additional start-up checks for similar environments.

kakkoyun avatar Oct 06 '22 11:10 kakkoyun

Just to clarify, the remediation in https://github.com/parca-dev/parca-agent/pull/875 does two things:

  • replaces the error with a warning log, so that the agent no longer exits;
  • fixes the logic: right now, even if the check returns an error, we still read the bpfEnabled variable, which might not hold a meaningful value depending on the error-handling semantics of the function.

As you said, this is just a remediation to unblock Parca Agent in environments where BPF is supported but the check fails, and we should still fix the "BPF support" check.
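The second point can be sketched as follows. This is an illustration in Python, not the Go code from the PR: `probe` is a hypothetical callable returning an `(enabled, err)` pair in the style of a Go function, and `evaluate` is an assumed name. The boolean is only consulted when no error was returned. What the caller does with an "unsupported" result is deliberately left open.

```python
import logging

logger = logging.getLogger("ebpf-check-sketch")


def evaluate(probe):
    """Classify a probe result as 'supported', 'unsupported' or 'unknown'.

    probe is a hypothetical callable returning (enabled, err), where err is
    None on success. When err is set, enabled is ignored entirely.
    """
    enabled, err = probe()
    if err is not None:
        # Inconclusive check: warn and report 'unknown' instead of failing.
        logger.warning("failed to determine if eBPF is supported: %s", err)
        return "unknown"
    return "supported" if enabled else "unsupported"
```

With this shape, a "kernel config not found" error yields "unknown" and never reaches the branch that reports the kernel as unsupported.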

javierhonduco avatar Oct 06 '22 11:10 javierhonduco

The config actually exists; the "uname" value we generate probably does not match the pattern. I'll investigate further.

[Six screenshots attached, captured 2022-10-06 between 18:27 and 18:32]

kakkoyun avatar Oct 06 '22 16:10 kakkoyun

What's the output of uname -r?

v-thakkar avatar Oct 06 '22 16:10 v-thakkar

> What's the output of uname -r?

5.4.0-122-generic

kakkoyun avatar Oct 06 '22 17:10 kakkoyun

OK, then ideally it should be able to read /boot/config-5.4.0-122-generic. It might be worth checking the permissions for reading the kernel config files. If that turns out to be the problem, we may need to find a workaround.
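One way to tell a permissions problem apart from a file that is simply not visible to the process is a small diagnostic like the one below. It is a sketch under assumptions: `config_status` is a hypothetical helper, not part of Parca Agent, and it has to be run from the same environment as the agent to be meaningful.

```python
import os


def config_status(path):
    """Return 'missing', 'unreadable' or 'readable' for a kernel config path."""
    if not os.path.exists(path):
        # The path is not visible from this process's filesystem view.
        return "missing"
    try:
        with open(path, "rb") as fh:
            fh.read(1)
    except OSError:
        # Covers permission errors and other failures to open or read.
        return "unreadable"
    return "readable"
```

For example, `config_status("/boot/config-5.4.0-122-generic")` returning "missing" points at the container's filesystem view and not at file permissions.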

v-thakkar avatar Oct 06 '22 17:10 v-thakkar

/boot is not mounted in the agent pods. Mounting it fixes the problem (did a quick try :cowboy_hat_face:):

spec:
  containers:
  - name: parca-agent
    volumeMounts:
    - mountPath: /boot
      name: boot
  volumes:
  - name: boot
    hostPath:
      path: /boot

maxbrunet avatar Oct 06 '22 18:10 maxbrunet