linux
linux copied to clipboard
NULL pointer dereference in aspeed_i2c_master_irq
Running a dev-6.0 kernel on an ast2500 platform I'm working on a port to I'm sometimes hitting a panic on busses with an IPMB device on them:
[27559.144928] 8<--- cut here ---
[27559.148067] Unable to handle kernel NULL pointer dereference at virtual address 00000122
[27559.156186] [00000122] *pgd=00000000
[27559.159813] Internal error: Oops: 5 [#1] ARM
[27559.164120] CPU: 0 PID: 0 Comm: swapper Not tainted 6.0.19-67c9407-dirty-766959e #1
[27559.171811] Hardware name: Generic DT based system
[27559.176617] PC is at aspeed_i2c_master_irq+0x10c/0x4d0
[27559.181829] LR is at aspeed_i2c_bus_irq+0x58/0x140
[27559.186657] pc : [<804eea20>] lr : [<804eee3c>] psr: 80000193
[27559.192934] sp : 80b01e70 ip : 00000001 fp : 8095bdb8
[27559.198170] r10: 80b08588 r9 : 80b08a60 r8 : 80dcb3c0
[27559.203403] r7 : 00000001 r6 : 00000001 r5 : 00000001 r4 : 84a84420
[27559.209937] r3 : 00000000 r2 : 00000000 r1 : 9f893440 r0 : 00000122
[27559.216475] Flags: Nzcv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none
[27559.223715] Control: 00c5387d Table: 85490008 DAC: 00000051
[27559.229474] Register r0 information: non-paged memory
[27559.234563] Register r1 information: 0-page vmalloc region starting at 0x9f893000 allocated at __devm_ioremap_resource+0x150/0x1d8
[27559.246372] Register r2 information: NULL pointer
[27559.251118] Register r3 information: NULL pointer
[27559.255856] Register r4 information: slab kmalloc-1k start 84a84400 pointer offset 32 size 1024
[27559.264638] Register r5 information: non-paged memory
[27559.269725] Register r6 information: non-paged memory
[27559.274809] Register r7 information: non-paged memory
[27559.279890] Register r8 information: slab kmalloc-192 start 80dcb3c0 pointer offset 0 size 192
[27559.288585] Register r9 information: non-slab/vmalloc memory
[27559.294279] Register r10 information: non-slab/vmalloc memory
[27559.300056] Register r11 information: non-slab/vmalloc memory
[27559.305836] Register r12 information: non-paged memory
[27559.311002] Process swapper (pid: 0, stack limit = 0x88a4d53e)
[27559.316872] Stack: (0x80b01e70 to 0x80b02000)
[27559.321265] 1e60: 84a84420 80c02520 00000001 00000001
[27559.329464] 1e80: 80dcb3c0 804eee3c 80e5f440 80c02520 00000000 00000024 80dcb3c0 80b08a60
[27559.337663] 1ea0: 80b08588 801522bc 00000012 80c97240 80dcb3c0 80c02520 80c97b4c 80b67050
[27559.345861] 1ec0: 9dfff000 8015244c 80dcb3c0 80c02520 80c97b4c 80156320 0000000c 80151bbc
[27559.354057] 1ee0: 0000000c 803bf368 00001000 a8b1ee02 80b08588 80c024c0 80b8cf54 00000000
[27559.362249] 1f00: 80b01f5c 80151bbc 80c024c0 801011bc 80b01f28 8073c97c 80102a00 60000013
[27559.370447] 1f20: ffffffff 80100b04 00000000 00000000 8443cb49 00000000 00000000 80b08a60
[27559.378648] 1f40: 80b03064 80b08a60 9dfff000 00000000 80b08588 8095bdb8 84b9acc8 80b01f78
[27559.386847] 1f60: 801029fc 80102a00 60000013 ffffffff 00000051 80743398 00000000 80743398
[27559.395044] 1f80: 00000000 80144f40 80b0d748 80ba5040 9dfff018 80145254 80b0d748 8073ca20
[27559.403242] 1fa0: 00000001 80a00c58 00000001 80a0126c ffffffff ffffffff 00000000 80a00658
[27559.411444] 1fc0: 00000000 80a29a48 a8b7ee11 00000000 00000000 80a0033c 00000051 00c0387d
[27559.419642] 1fe0: 00000000 8feea000 410fb767 00c5387d 00000000 00000000 00000000 00000000
[27559.427843] aspeed_i2c_master_irq from aspeed_i2c_bus_irq+0x58/0x140
[27559.434364] aspeed_i2c_bus_irq from __handle_irq_event_percpu+0x50/0x168
[27559.441221] __handle_irq_event_percpu from handle_irq_event+0x30/0x70
[27559.447792] handle_irq_event from handle_simple_irq+0x7c/0xac
[27559.453679] handle_simple_irq from generic_handle_domain_irq+0x2c/0x3c
[27559.460335] generic_handle_domain_irq from aspeed_i2c_ic_irq_handler+0x64/0xf4
[27559.467685] aspeed_i2c_ic_irq_handler from generic_handle_domain_irq+0x2c/0x3c
[27559.475035] generic_handle_domain_irq from avic_handle_irq+0x44/0x60
[27559.481516] avic_handle_irq from generic_handle_arch_irq+0x28/0x3c
[27559.487853] generic_handle_arch_irq from __irq_svc+0x84/0xac
[27559.493645] Exception stack(0x80b01f28 to 0x80b01f70)
[27559.498722] 1f20: 00000000 00000000 8443cb49 00000000 00000000 80b08a60
[27559.506924] 1f40: 80b03064 80b08a60 9dfff000 00000000 80b08588 8095bdb8 84b9acc8 80b01f78
[27559.515114] 1f60: 801029fc 80102a00 60000013 ffffffff
[27559.520187] __irq_svc from arch_cpu_idle+0x30/0x38
[27559.525118] arch_cpu_idle from default_idle_call+0x2c/0x74
[27559.530743] default_idle_call from do_idle+0x70/0xd8
[27559.535848] do_idle from cpu_startup_entry+0x14/0x18
[27559.540943] cpu_startup_entry from rest_init+0x80/0x9c
[27559.546230] rest_init from arch_post_acpi_subsys_init+0x0/0x18
[27559.552226] Code: e59421cc e59411a4 e282c001 e584c1cc (e7d02002)
[27559.558356] ---[ end trace 0000000000000000 ]---
[27559.563002] Kernel panic - not syncing: Fatal exception in interrupt
aspeed_i2c_master_irq+0x10c points to this line, with msg->buf
being NULL.
I've seen it happen a few times, but it's not 100% reproducible, so I'm guessing it maybe a race condition of some sort.
Caught a maybe-related KASAN dump in a similar scenario (though I now think the IPMB device may in fact be unrelated):
[ 614.857168] ==================================================================
[ 614.864585] BUG: KASAN: slab-out-of-bounds in aspeed_i2c_master_irq+0x244/0x708
[ 614.872150] Read of size 2 at addr 8443b6e4 by task swapper/0
[ 614.878047]
[ 614.879645] CPU: 0 PID: 0 Comm: swapper Not tainted 6.0.19-67c9407-dirty-90ab623 #1
[ 614.887447] Hardware name: Generic DT based system
[ 614.892376] unwind_backtrace from show_stack+0x18/0x1c
[ 614.897895] show_stack from dump_stack_lvl+0x24/0x2c
[ 614.903245] dump_stack_lvl from print_report+0x2bc/0x5f8
[ 614.908919] print_report from kasan_report+0xc4/0x108
[ 614.914286] kasan_report from aspeed_i2c_master_irq+0x244/0x708
[ 614.920561] aspeed_i2c_master_irq from aspeed_i2c_bus_irq+0x70/0x188
[ 614.927251] aspeed_i2c_bus_irq from __handle_irq_event_percpu+0x78/0x194
[ 614.934313] __handle_irq_event_percpu from handle_irq_event+0x50/0xb0
[ 614.941076] handle_irq_event from handle_simple_irq+0xc8/0x104
[ 614.947248] handle_simple_irq from generic_handle_domain_irq+0x40/0x50
[ 614.954110] generic_handle_domain_irq from aspeed_i2c_ic_irq_handler+0xe4/0x1bc
[ 614.961809] aspeed_i2c_ic_irq_handler from generic_handle_domain_irq+0x40/0x50
[ 614.969395] generic_handle_domain_irq from avic_handle_irq+0x50/0x74
[ 614.976083] avic_handle_irq from generic_handle_arch_irq+0x28/0x3c
[ 614.982629] generic_handle_arch_irq from call_with_stack+0x18/0x20
[ 614.989153] call_with_stack from __irq_svc+0x78/0x94
[ 614.994423] Exception stack(0x81003f30 to 0x81003f78)
[ 614.999627] 3f20: 8100ce00 00000000 00000000 00000000
[ 615.007960] 3f40: 8100ce00 00000000 81005064 ffffffff 00000000 410fb767 00c5387d 00000000
[ 615.016291] 3f60: 8100ce00 81003f80 801036b4 801036b8 60000013 ffffffff
[ 615.023037] __irq_svc from arch_cpu_idle+0x30/0x38
[ 615.028151] arch_cpu_idle from default_idle_call+0x34/0x7c
[ 615.033977] default_idle_call from do_idle+0x88/0xe4
[ 615.039272] do_idle from cpu_startup_entry+0x14/0x18
[ 615.044548] cpu_startup_entry from rest_init+0xa8/0xc4
[ 615.050037] rest_init from arch_post_acpi_subsys_init+0x0/0x18
[ 615.056208]
[ 615.057807] Allocated by task 262:
[ 615.061326] seq_read_iter+0x424/0x83c
[ 615.065275] io_read+0x1b4/0x81c
[ 615.068706] io_issue_sqe+0x90/0x3ac
[ 615.072495] io_submit_sqes+0x3b8/0x9a8
[ 615.076518] sys_io_uring_enter+0x3fc/0xc6c
[ 615.080887] ret_fast_syscall+0x0/0x54
[ 615.084808]
[ 615.086397] Freed by task 262:
[ 615.089572] kasan_set_free_info+0x20/0x34
[ 615.093867] __kasan_slab_free+0xe4/0x134
[ 615.098049] slab_free_freelist_hook+0x84/0x150
[ 615.102794] kfree+0x90/0x28c
[ 615.105928] seq_release+0x2c/0x48
[ 615.109518] kernfs_fop_release+0x64/0x110
[ 615.113804] __fput+0xd0/0x378
[ 615.117037] task_work_run+0x98/0xc8
[ 615.120810] do_work_pending+0x574/0x698
[ 615.124927] slow_work_pending+0xc/0x20
[ 615.128926]
[ 615.130516] The buggy address belongs to the object at 8443a000
[ 615.130516] which belongs to the cache kmalloc-4k of size 4096
[ 615.142475] The buggy address is located 1764 bytes to the right of
[ 615.142475] 4096-byte region [8443a000, 8443b000)
[ 615.153662]
[ 615.155255] The buggy address belongs to the physical page:
[ 615.160936] page:343b2b83 refcount:1 mapcount:0 mapping:00000000 index:0x0 pfn:0x84438
[ 615.169022] head:343b2b83 order:3 compound_mapcount:0 compound_pincount:0
[ 615.175953] flags: 0x10200(slab|head|zone=0)
[ 615.180432] raw: 00010200 00000100 00000122 814019a0 00000000 00040004 ffffffff 00000001
[ 615.188642] page dumped because: kasan: bad access detected
[ 615.194326]
[ 615.195915] Memory state around the buggy address:
[ 615.200831] 8443b580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 615.207511] 8443b600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 615.214193] >8443b680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 615.220840] ^
[ 615.226633] 8443b700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 615.233309] 8443b780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 615.239957] ==================================================================
aspeed_i2c_master_irq+0x244
points here.
Task 262, FWIW, is adcsensor, doing i2c-unrelated things.
Some further debug-experiment results: after disabling CONFIG_I2C_SLAVE
I don't seem to be able to reproduce it (whereas I had been hitting it pretty reliably with that enabled).
Hello @zevweiss any further updates on this.
We are also seeing similar issue, and have "I2C_SLAVE_ENABLED".
Hi @ksrikanth -- since I realized that the platform I'd been encountering this on didn't actually require CONFIG_I2C_SLAVE
this became a much lower-priority issue for us and I haven't looked into further I'm afraid.