libbpf icon indicating copy to clipboard operation
libbpf copied to clipboard

Can't include btf.h without kernel headers from the 6.0 cycle

Open tohojo opened this issue 3 years ago • 7 comments

The commit https://github.com/libbpf/libbpf/commit/c3f8eecb16ad02ebb3a41a31349f1b851cebe46b added a dereference of struct btf_enum64 in an inline function in libbpf's btf.h. However, struct btf_enum64 does not appear in the kernel header files until the 6.0 cycle, so this makes it impossible to build an application that includes bpf/btf.h without also vendoring in the newer kernel header (from a not-yet-released kernel version).

We hit this in xdp-tools, see https://github.com/xdp-project/xdp-tools/issues/220

tohojo avatar Aug 24 '22 09:08 tohojo

We generally recommend to use UAPI headers bundled with libbpf itself under https://github.com/libbpf/libbpf/tree/master/include/uapi/linux. Would that work for you?

Also please see https://github.com/libbpf/libbpf#libbpf-and-general-bpf-usage-questions, thanks.

anakryiko avatar Aug 26 '22 02:08 anakryiko

Well, we worked around it for xdp-tools. However, this will also break any application that tries to build against a distro-provided version of libbpf. Right now even a distro with a completely up-to-date set of kernel headers (which would be 5.19) won't work. And even if the application doesn't use btf_enum64 compilation still fails because of the inline functions in libbpf's btf.h.

I know you recommend vendoring libbpf as a submodule, but not everyone does that, and distributions are still a thing. We're trying very hard to make libbpf and the rest of BPF work well in a traditional distro environment, it would be helpful if you could at least not explicitly break this, even if you don't want to support it yourself. At the very least, by not relying on kernel headers before they're in a released version of the kernel...

tohojo avatar Aug 26 '22 09:08 tohojo

This wasn't intentional, if that's what you are implying. But it's also not as simple as just re-defining those BTF_KIND_xxx constants (we can't just re-define struct btf_enum64 due to conflicts with UAPI header). Do you have any suggestions on how to fix this, short of removing those new helpers from btf.h?

cc @yonghong-song

anakryiko avatar Aug 26 '22 16:08 anakryiko

This wasn't intentional, if that's what you are implying.

I didn't mean to imply it was intentional, no; sorry if it came out that way. I was just reacting to your response linking to the "recommended usage" document, which I found was a bit dismissive of this being a real bug.

Of course it would be great if we could improve the CI so something like this would get caught automatically in the future. Compiling a trivial program including all the libbpf headers, but without the vendored kernel headers in the include path should be enough I suppose?

But it's also not as simple as just re-defining those BTF_KIND_xxx constants (we can't just re-define struct btf_enum64 due to conflicts with UAPI header). Do you have any suggestions on how to fix this, short of removing those new helpers from btf.h?

It would be nice if we could ifdef the inline helpers somehow, so they are only defined if the struct definition is available. That way it wouldn't hurt any users just including the headers but not actually using any of the helpers. Not sure if there's a good definition to key the ifdef off of, though...

The only other thing I can think of is making the helpers non-inline, so the deref happens in a .c file instead; then an incomplete forward-def should be enough for defining the helper prototype with the struct pointer argument. It's a bit annoying to have such a simple helper be non-inline, though, so not ideal either :(

tohojo avatar Aug 28 '22 19:08 tohojo

We have faced the problem in Debian now. As the libbpf maintainer in Debian I updated the package and now some of the other packages which depends on libbpf fails with:

/usr/include/bpf/btf.h: In function 'btf_enum64_value':
/usr/include/bpf/btf.h:496:25: error: invalid use of undefined type 'const struct btf_enum64'
  496 |         return ((__u64)e->val_hi32 << 32) | e->val_lo32;
      |                         ^~
/usr/include/bpf/btf.h:496:46: error: invalid use of undefined type 'const struct btf_enum64'
  496 |         return ((__u64)e->val_hi32 << 32) | e->val_lo32;
      |                                              ^~

Or as @tohojo mentioned xdp-tools also failed to build in Debian.

Also, the Readme says: No ties to any specific kernel, transparent handling of older kernels. It has built-in mechanisms to gracefully handle older kernels, that are missing some of the features, by working around or gracefully degrading functionality. But it seems we are now tied to v6.0+.

Is it possible to do some kind of versioned dependency as a possible solution please..

sudipm-mukherjee avatar Sep 05 '22 16:09 sudipm-mukherjee

@sudipm-mukherjee I just tagged a v1.2.7 of xdp-tools which should fix the build issue against v1.0.0 of libbpf...

tohojo avatar Sep 05 '22 18:09 tohojo

Thanks @tohojo. It will be upto the Debian maintainer of xdp-tools to update that package. I will point him to the new version. but iiuc, this is just a workaround and not a solution. And, also its not just xdp-tools which saw this problem, its also there in bpfcc and dwarves.

sudipm-mukherjee avatar Sep 06 '22 08:09 sudipm-mukherjee

This should be fixed by https://patchwork.kernel.org/project/netdevbpf/patch/[email protected]/. Once that lands upstream I'll sync this into Github and will cherry-pick as 1.0.1 version.

anakryiko avatar Sep 27 '22 04:09 anakryiko

@tohojo can you please test https://github.com/libbpf/libbpf/tree/libbpf-v1.0.1 in your environment to make sure your original issue is fixed and nothing new cropped up? Thanks.

anakryiko avatar Sep 27 '22 22:09 anakryiko

Yup, the original build issue goes away if I switch libbpf over to using that branch (and revert the change where we pulled in the kernel header from 6.0). Thanks for the fix! :)

tohojo avatar Sep 29 '22 14:09 tohojo

@tohojo thanks for checking!

anakryiko avatar Sep 29 '22 17:09 anakryiko

It seems the issue was not (fully?) resolved, as can be seen in this issue attempting to build dwarves using 5.15 kernel headers: https://github.com/acmel/dwarves/issues/49. Perhaps it should be reopened?

apteryks avatar May 10 '24 12:05 apteryks

Apparently it's an issue in dwarves that requires the btf_enum64 definition.

apteryks avatar May 10 '24 17:05 apteryks