
`biotop` and `biosnoop` do not work under 5.19 kernel due to missing `blk_account_io_start` kprobe

Open haozhangphd opened this issue 3 years ago • 9 comments

On the latest Arch Linux installation, and on Fedora 36 with the 5.19 kernel, the kprobes `blk_account_io_start` and `__blk_account_io_start` are missing, as seen from the `/proc/kallsyms` file. As a result, both `biotop` and `biosnoop` fail to start with the following error message:

Exception: Failed to attach BPF program b'trace_pid_start' to kprobe b'blk_account_io_start', it's not traceable (either non-existing, inlined, or marked as "notrace")
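
The `/proc/kallsyms` check described above can be scripted. This is only a minimal sketch, assuming the usual `address type name [module]` layout of one symbol per line; the helper names `kallsyms_names` and `missing_symbols` and the sample text are illustrative and are not part of bcc. Note that a symbol being listed in kallsyms does not by itself guarantee that a kprobe can attach to it.

```python
def kallsyms_names(text):
    """Return the set of symbol names found in /proc/kallsyms-style text."""
    names = set()
    for line in text.splitlines():
        fields = line.split()
        # Assumed layout: <address> <type> <name> [<module>]
        if len(fields) >= 3:
            names.add(fields[2])
    return names


def missing_symbols(text, wanted):
    """Return the symbols from `wanted` that do not appear in the text."""
    present = kallsyms_names(text)
    return [name for name in wanted if name not in present]


# Illustrative sample only, not taken from a real system.
SAMPLE = """\
0000000000000000 T blk_mq_start_request
0000000000000000 t blk_account_io_done
0000000000000000 t nvme_queue_rq\t[nvme]
"""

# The two kprobe targets named in this issue.
WANTED = ["blk_account_io_start", "__blk_account_io_start"]

if __name__ == "__main__":
    try:
        with open("/proc/kallsyms") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the sample where the file is unavailable
    print(missing_symbols(text, WANTED))
```

On an affected kernel this would be expected to list both names as missing.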

haozhangphd avatar Sep 30 '22 12:09 haozhangphd

This is correct, but it is caused by Linux kernel commit 450b7879e34517c3ebc3a35a53806fe40e60fac2, which was introduced in 5.17.

Kernel developers do not guarantee the stability of tracing symbols the way they do for the Linux ABI. This is not an issue with bcc but with the Linux kernel.

One workaround is to revert the commit mentioned above and compile the kernel manually. Alternatively, changing `static inline void` to `void` would resolve this.

As this is not related to bcc, please close this issue. If you need more clarification, ask here and I will try my best to answer.

devidasjadhav avatar Oct 06 '22 17:10 devidasjadhav

I think bcc needs to be kept up-to-date with respect to the kernel, and not the other way around. Manually reverting kernel commits in order to make bcc work is not practical.

In the past, bcc has always kept pace with changes to the kernel tracing symbols, as evidenced by, for example, 95c9229ea9f029a1b9e8dcbe86fc67f037c0dfa2 and 97c20767923db7ea8c1ec6e07b2297b00992af03. Thus, incompatibility with the latest kernel is indeed a bcc issue.

haozhangphd avatar Oct 06 '22 18:10 haozhangphd

Hi.

> the kprobes blk_account_io_start as well as __blk_account_io_start are missing

I do not know about biosnoop, but the CO-RE version of biotop should normally handle the case you point out.

This is indeed not a solution to the problem you point out, but if you need to track down block I/O you can, for the moment, use the CO-RE version instead of the standard one.

Best regards.

eiffel-fl avatar Oct 12 '22 08:10 eiffel-fl

> I do not know about biosnoop, but the CO-RE version of biotop should normally handle the case you point out.

From the source code, it seems the CO-RE version depends on the same two kprobes, `__blk_account_io_start` or `blk_account_io_start`, as the Python version? On newer kernels without either of these two kprobes, it seems I cannot use the CO-RE version either...

haozhangphd avatar Oct 12 '22 15:10 haozhangphd

> > I do not know about biosnoop, but the CO-RE version of biotop should normally handle the case you point out.
>
> From the source code, it seems the CO-RE version depends on the same two kprobes, `__blk_account_io_start` or `blk_account_io_start`, as the Python version? On newer kernels without either of these two kprobes, it seems I cannot use the CO-RE version either...

Sorry, I read too quickly. After taking a look at the commit quoted above, it indeed seems we cannot really do anything about this problem... Maybe another workaround would be to mark the `__` functions as `noinline` so we can probe them.

eiffel-fl avatar Oct 12 '22 15:10 eiffel-fl

I tried manually patching the kernel by making `__blk_account_io_start` `noinline`, and biotop indeed works. However, I don't think this is a practical solution for most users, as these bio tools will be unusable for anyone using kernels newer than 5.17. Is there any other kprobe that can be used besides `__blk_account_io_start` for a similar purpose?

haozhangphd avatar Oct 12 '22 22:10 haozhangphd

> Is there any other kprobe that can be used besides `__blk_account_io_start` for a similar purpose?

Off the top of my head, I do not think so, but maybe someone here has a better idea than me. Nonetheless, I think sending your patches that add `noinline` to the `__` functions to the upstream kernel mailing list would be a good contribution. What do you think?

eiffel-fl avatar Oct 13 '22 07:10 eiffel-fl

Maybe we can add a tracepoint for it. I am working on it.

chenhengqi avatar Oct 13 '22 12:10 chenhengqi

> I am working on it.

If you need a review, you can cc me either here or at "flaniel at linux dot microsoft dot com" on the upstream kernel mailing list.

eiffel-fl avatar Oct 13 '22 13:10 eiffel-fl

Is there a temporary workaround for this issue, including for the newer 6.0 and 6.1-rc kernels?

darkblaze69 avatar Nov 18 '22 06:11 darkblaze69

> Is there a temporary workaround for this issue, including for the newer 6.0 and 6.1-rc kernels?

To my knowledge, there is no temporary workaround. But you can help review this kernel patch so it can be merged soon.

eiffel-fl avatar Nov 21 '22 07:11 eiffel-fl

Indeed, a tracepoint is the best option here. There might be an alternative to `blk_account_io_start()`, but it might produce a different result, or that function might be inlined as well. So ultimately, the tracepoint suggested by @chenhengqi might be the best long-term solution. I see one tracepoint, block/block_bio_complete, but I have not dug into whether it is an appropriate replacement or not.
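
The "kprobe if present, otherwise tracepoint" idea can be sketched as a simple preference list. This is a hedged illustration only: `pick_attach_point` and `CANDIDATES` are hypothetical names, not bcc code, and `block:block_bio_complete` is used purely as a placeholder because it is the tracepoint mentioned here; whether it is a suitable replacement has not been established.

```python
def pick_attach_point(candidates, kprobes, tracepoints):
    """Return the first (kind, name) candidate the kernel exposes, or None."""
    for kind, name in candidates:
        available = kprobes if kind == "kprobe" else tracepoints
        if name in available:
            return kind, name
    return None


# Preference order: the two kprobes named in this issue, then a tracepoint.
# The tracepoint entry is a placeholder, not a verified replacement.
CANDIDATES = [
    ("kprobe", "__blk_account_io_start"),
    ("kprobe", "blk_account_io_start"),
    ("tracepoint", "block:block_bio_complete"),
]

if __name__ == "__main__":
    # Hypothetical kernel where both kprobe targets have been inlined away.
    print(pick_attach_point(CANDIDATES, set(), {"block:block_bio_complete"}))
```

A tool built this way keeps working on older kernels that still expose the kprobe targets and only falls back when neither is available.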

yonghong-song avatar Nov 26 '22 23:11 yonghong-song

Unfortunately, it seems that folio_account_dirtied also has this problem (in libbpf-tools/cachestat) on v6.1:

$ objdump -d page-writeback.o | grep folio_account_dirtied
# got nothing
$ sudo bpftrace -l | grep folio_account_dirtied
# got nothing

This causes:

$ sudo ./cachestat 
libbpf: prog 'kprobe_account_page_dirtied': failed to create kprobe 'account_page_dirtied+0x0' perf event: No such file or directory
libbpf: prog 'kprobe_account_page_dirtied': failed to auto-attach: -2
failed to attach BPF programs

Rtoax avatar Feb 14 '23 14:02 Rtoax

Sorry, my mistake: there is a tracepoint, tracepoint:writeback:writeback_dirty_folio. I closed the PR (https://github.com/iovisor/bcc/pull/4482), and I'll submit a new one.
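
Whether such a tracepoint exists on a given kernel can be checked from tracefs. A minimal sketch, assuming the `available_events` file lists one `subsystem:event` entry per line and is mounted at one of the two usual locations; `has_tracepoint` and the sample text are illustrative, and reading the file normally requires root.

```python
def has_tracepoint(available_events_text, subsystem, event):
    """Return True if `subsystem:event` is listed in available_events text."""
    wanted = f"{subsystem}:{event}"
    return any(
        line.strip() == wanted
        for line in available_events_text.splitlines()
    )


# Usual tracefs mount points (assumed; distributions may differ).
PATHS = [
    "/sys/kernel/tracing/available_events",
    "/sys/kernel/debug/tracing/available_events",
]

# Illustrative sample only, not taken from a real system.
SAMPLE = """\
writeback:writeback_dirty_folio
block:block_bio_complete
"""

if __name__ == "__main__":
    text = SAMPLE
    for path in PATHS:
        try:
            with open(path) as f:
                text = f.read()
            break
        except OSError:
            continue  # not mounted or not readable; try the next path
    print(has_tracepoint(text, "writeback", "writeback_dirty_folio"))
```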

Rtoax avatar Feb 15 '23 03:02 Rtoax