Newer kernels (>4.x) generate requests with a NULL bio.
Stackbd tries to clone it and panics with a NULL pointer dereference.
Thanks for noting this. I think I have run into the same issue, but with kernel 3.17. Do you have a fix that handles this?
This appears to be reproducible even with 3.13 (CentOS 7.5, kernel 3.13.0, MLNX OFED 3.4).
Below is the kernel oops right after swapon /dev/infiniswap0. Sometimes the error doesn't happen until the swap device is actually in use, but it always shows up very quickly, once 10-100 MB have been swapped out.
[ 728.619403] In IS_session_create() with portal: rdma://1,10.10.10.4:9400,
[ 728.626210] rdma://1,10.10.10.4:9400,
[ 728.629876] portal: 10.10.10.4, 9400
[ 733.985157] IS_register_block_device, dev_name infiniswap0
[ 733.990649] IS: init done
[ 733.993432] stackbd: init done
[ 733.996501] Opened /dev/loop0
[ 733.999476] stackbd: Device real capacity: 104857600
[ 734.004440] stackbd: Max sectors: 8
[ 734.007965] stackbd: done initializing successfully
[ 748.056376] evict_handler, waiting for STOP msg
[ 752.832599] BUG: unable to handle kernel NULL pointer dereference at 0000000000000068
[ 752.840455] IP: [] bio_clone_bioset+0x11/0x70
[ 752.846388] PGD 7b6675067 PUD 78baba067 PMD 0
[ 752.850882] Oops: 0000 [#1] PREEMPT SMP
[ 752.854855] Modules linked in: infiniswap(OF) xt_CT xt_mac xt_comment xt_physdev xt_set ip_set_hash_net ip_set iptable_raw xt_CHECKSUM iptable_mangle ipt_REJECT rbd libceph ebtable_filter ebtables xt_nat xt_tcpudp openvswitch gre ipt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter xt_conntrack nf_nat nf_conntrack veth rdma_ucm(OF) ib_ucm(OF) rdma_cm(OF) iw_cm(OF) configfs ib_ipoib(OF) ib_cm(OF) ib_uverbs(OF) ib_umad(OF) mlx5_ib(OF) mlx5_core(OF) iTCO_wdt gpio_ich iTCO_vendor_support mlx4_en(OF) mlx4_ib(OF) ib_sa(OF) ib_mad(OF) ib_core(OF) ib_addr(OF) ib_netlink(OF) ipmi_devintf coretemp x86_pkg_temp_thermal kvm_intel kvm dm_thin_pool dm_persistent_data crct10dif_pclmul crc32_pclmul dm_bufio ghash_clmulni_intel dm_bio_prison aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd vxlan ip_tunnel lpc_ich i2c_i801 pcspkr mfd_core shpchp joydev wmi ipmi_si ipmi_msghandler acpi_power_meter evbug mlx4_core(OF) mlx_compat(OF) ip_tables x_tables nfsv3 nfs_acl nfs lockd fscache bridge stp llc hid_generic igb usbkbd usbmouse usbhid hid i2c_algo_bit dca hwmon ahci ptp libahci pps_core sunrpc ipv6 autofs4
[ 752.963065] CPU: 4 PID: 104 Comm: kswapd0 Tainted: GF O 3.13.0-scaleos #1
[ 752.970627] Hardware name: Supermicro SYS-1028TR-TF/X10DRT-LIBF, BIOS 2.0 12/17/2015
[ 752.978365] task: ffff88105b9a1a10 ti: ffff88085be54000 task.ti: ffff88085be54000
[ 752.985842] RIP: 0010:[] [] bio_clone_bioset+0x11/0x70
[ 752.994201] RSP: 0018:ffff88085be55578 EFLAGS: 00010246
[ 752.999511] RAX: ffff88104fa4b000 RBX: ffff88104fb9d728 RCX: 0000000000000000
[ 753.006639] RDX: ffff88085c143800 RSI: 0000000000000020 RDI: 0000000000000000
[ 753.013769] RBP: ffff88085be55590 R08: ffff8807a68beee0 R09: 0000000017027000
[ 753.020898] R10: ffffc90020192480 R11: 00000000000141c0 R12: 0000000000000000
[ 753.028028] R13: 0000000000000020 R14: ffff88104fb9d728 R15: ffff880859f9ca80
[ 753.035159] FS: 0000000000000000(0000) GS:ffff88085fd00000(0000) knlGS:0000000000000000
[ 753.043242] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 753.048983] CR2: 0000000000000068 CR3: 00000007b9c08000 CR4: 00000000003407e0
[ 753.056113] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 753.063243] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 753.070372] Stack:
[ 753.072383] ffff88104fb9d728 0000000000000000 0000000000000000 ffff88085be555b0
[ 753.079845] ffffffffa06ce93c 0000000000000000 ffff88104fb9d738 ffff88085be55618
[ 753.087304] ffffffffa06ceb9b ffff880074989000 0000000100000000 0000000017027000
[ 753.094763] Call Trace:
[ 753.097213] [] stackbd_bio_generate+0x2c/0xa0 [infiniswap]
[ 753.104349] [] IS_rdma_write+0x14b/0x1f0 [infiniswap]
[ 753.111044] [] IS_transfer_chunk+0x74/0xd0 [infiniswap]
[ 753.117915] [] IS_queue_rq+0x224/0x390 [infiniswap]
[ 753.124438] [] __blk_mq_run_hw_queue+0x1c3/0x3f0
[ 753.130700] [] blk_mq_run_hw_queue+0x35/0x40
[ 753.136613] [] blk_mq_insert_requests+0xba/0x140
[ 753.142877] [] blk_mq_flush_plug_list+0x129/0x140
[ 753.149227] [] blk_flush_plug_list+0xd9/0x230
[ 753.155228] [] blk_mq_make_request+0x37a/0x4e0
[ 753.161316] [] generic_make_request+0xc2/0x110
[ 753.167403] [] submit_bio+0x71/0x150
[ 753.172628] [] ? test_set_page_writeback+0x115/0x180
[ 753.179235] [] __swap_writepage+0x164/0x210
[ 753.185066] [] ? _raw_spin_lock+0x17/0x60
[ 753.190719] [] ? _raw_spin_unlock+0x1c/0x60
[ 753.196548] [] ? page_swapcount+0x4c/0x60
[ 753.202202] [] swap_writepage+0x39/0x70
[ 753.207685] [] shmem_writepage+0x198/0x2d0
[ 753.213427] [] shrink_page_list+0x47b/0x9f0
[ 753.219255] [] shrink_inactive_list+0x228/0x4c0
[ 753.225432] [] shrink_lruvec+0x4d1/0x650
[ 753.230999] [] shrink_zone+0x31/0x100
[ 753.236308] [] balance_pgdat+0x386/0x5b0
[ 753.241874] [] kswapd+0x156/0x440
[ 753.246837] [] ? prepare_to_wait_event+0x100/0x100
[ 753.253274] [] ? balance_pgdat+0x5b0/0x5b0
[ 753.259016] [] kthread+0xc9/0xe0
[ 753.263888] [] ? kthread_create_on_node+0x190/0x190
[ 753.270412] [] ret_from_fork+0x7c/0xb0
[ 753.275806] [] ? kthread_create_on_node+0x190/0x190
[ 753.282325] Code: 41 5c 5d c3 66 0f 1f 44 00 00 48 89 df e8 e8 62 fb ff eb 92 0f 0b 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 41 55 41 89 f5 41 54 53 <8b> 77 68 48 89 fb 44 89 ef e8 21 fa ff ff 48 85 c0 49 89 c4 74
[ 753.302303] RIP [] bio_clone_bioset+0x11/0x70
[ 753.308338] RSP
[ 753.311824] CR2: 0000000000000068
If I check for (req->bio == NULL) before calling bio_clone(), I get a hit, but the kernel module still fails because the request is not handled.
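To illustrate why the NULL check alone is not enough, here is a minimal userspace sketch in C. It is not the stackbd or infiniswap code: `mock_bio`, `mock_request`, `mock_bio_clone()`, `mock_end_request()` and `handle_request()` are hypothetical stand-ins for the kernel types and calls. It only shows the control flow: a request with no bio skips the clone, but it is still completed rather than silently dropped.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-ins for the kernel types; all names are hypothetical. */
struct mock_bio {
    size_t len;
};

struct mock_request {
    struct mock_bio *bio;   /* may be NULL, as in the oops above */
    int completed;          /* set once the request has been ended */
    int status;             /* 0 on success, negative on error */
};

/* Stand-in for bio_clone(): it must never be handed a NULL source. */
static struct mock_bio *mock_bio_clone(const struct mock_bio *src)
{
    struct mock_bio *copy = malloc(sizeof(*copy));

    if (copy)
        memcpy(copy, src, sizeof(*copy));
    return copy;
}

/* Stand-in for telling the block layer that the request is finished. */
static void mock_end_request(struct mock_request *req, int status)
{
    req->status = status;
    req->completed = 1;
}

/* Returns 1 if a clone was made and forwarded, 0 otherwise. */
static int handle_request(struct mock_request *req)
{
    struct mock_bio *clone;

    if (req->bio == NULL) {
        /* Skipping the clone avoids the crash, but the request must
         * still be completed or its submitter waits forever. */
        mock_end_request(req, 0);
        return 0;
    }

    clone = mock_bio_clone(req->bio);
    if (clone == NULL) {
        mock_end_request(req, -1);  /* allocation failed */
        return 0;
    }

    /* A real driver would submit the clone to the backing device and
     * complete the request from the clone's completion callback. */
    free(clone);
    mock_end_request(req, 0);
    return 1;
}
```

Calling `handle_request()` on a request whose `bio` is NULL leaves `completed` set, which is the part missing when the check simply returns early. Whether such a request should be ended with success or with an error depends on what kind of request it is; the sketch assumes success.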