gtp5g
Kernel panic when running 10x ran.sh in parallel
Hi all,
I'm getting a kernel panic when running 10 instances of this script in parallel (with 10 UEs, 10 different gNB IPs, and 10 different GTP-U interfaces): https://github.com/free5gc/libgtp5gnl/blob/master/script/ran.sh but using the Go binaries from https://github.com/free5gc/go-gtp5gnl (with a tweak here: https://github.com/free5gc/go-gtp5gnl/blob/4f36b49ab7f7f90632b0981aa832121438e5a243/cmd/gogtp5g-link/main.go#L72 so that an interface binds only to a single specified IP address instead of binding to all IP addresses).
Adding a small sleep of 20 ms before launching the 2nd, 3rd, and subsequent scripts works around the issue.
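For reference, the staggered launch can be sketched roughly as below. This is only a minimal illustration, not the exact harness used: the helper name `launch_staggered` is hypothetical, passing the instance index as the last argument to `ran.sh` is an assumption, and a fractional `sleep` relies on an implementation that accepts it (GNU coreutils does).

```shell
#!/bin/sh
# launch_staggered COUNT DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: starts COUNT background copies of COMMAND, appending
# the instance index (1..COUNT) as the last argument, and sleeps
# DELAY_SECONDS between consecutive launches.
# Returns the number of instances that exited with a non-zero status.
# Note: plain sh has no portable "local", so the variables below are global.
launch_staggered() {
    count=$1
    delay=$2
    shift 2
    pids=""
    i=1
    while [ "$i" -le "$count" ]; do
        "$@" "$i" &
        pids="$pids $!"
        # Pause before starting the next instance (not after the last one).
        if [ "$i" -lt "$count" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    # Wait for every instance and count the failures.
    failed=0
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    return "$failed"
}

# Assumed usage, 10 instances started 20 ms apart:
#   launch_staggered 10 0.02 ./ran.sh
```

Setting the delay to 0 gives the fully parallel launch that triggers the panic, so the same helper can be used to compare both cases.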
According to the kernel panic, the issue seems to lie inside gtp5g_genl_add_pdr; you'll find the kernel panic logs at the end of this post. If I can be of any help, don't hesitate to ask. Thank you! Also, a quick question: is it possible to have multiple gtp5g interfaces on the same IP address/port? Thanks!
[ 57.441597] BUG: kernel NULL pointer dereference, address: 0000000000000080
[ 57.442989] #PF: supervisor write access in kernel mode
[ 57.443983] #PF: error_code(0x0002) - not-present page
[ 57.444965] PGD 802150067 P4D 802150067 PUD 819cfd067 PMD 0
[ 57.446041] Oops: 0002 [#1] SMP NOPTI
[ 57.446749] CPU: 20 PID: 2009 Comm: app Tainted: G OE 5.4.0-152-generic #169-Ubuntu
[ 57.448424] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[ 57.450438] RIP: 0010:gtp5g_genl_add_pdr+0x185/0x270 [gtp5g]
[ 57.451513] Code: 01 00 00 be 20 0b 00 00 e8 88 0b f9 e3 49 89 c5 48 85 c0 0f 84 e2 00 00 00 49 8b 54 24 10 b8 01 00 00 00 48 8d ba 80 00 00 00 <f0> 0f c1 82 80 00 00 00 85 c0 74 69 78 5b 83 c0 01 78 56 49 8b 44
[ 57.455015] RSP: 0018:ffffae83c797ba10 EFLAGS: 00010286
[ 57.456013] RAX: 0000000000000001 RBX: ffffae83c797baa8 RCX: 0000000000000000
[ 57.457366] RDX: 0000000000000000 RSI: ffffffffc090cc58 RDI: 0000000000000080
[ 57.458716] RBP: ffffae83c797ba50 R08: ffff9d421fa35140 R09: ffff9d3a1f406d80
[ 57.460064] R10: ffff9d421ab92e00 R11: 0000000000000011 R12: ffff9d3a0360d8c0
[ 57.461414] R13: ffff9d421ab92e00 R14: 0000000000000000 R15: ffff9d421ab90c00
[ 57.462763] FS: 00007f9d5cff9700(0000) GS:ffff9d421fa00000(0000) knlGS:0000000000000000
[ 57.464290] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 57.465381] CR2: 0000000000000080 CR3: 0000000802254005 CR4: 00000000007606e0
[ 57.466760] PKRU: 55555554
[ 57.467292] Call Trace:
[ 57.467779] genl_family_rcv_msg+0x1b9/0x470
[ 57.468601] genl_rcv_msg+0x4c/0xa0
[ 57.469278] ? _cond_resched+0x19/0x30
[ 57.470002] ? genl_family_rcv_msg+0x470/0x470
[ 57.470853] netlink_rcv_skb+0x50/0x120
[ 57.471588] genl_rcv+0x29/0x40
[ 57.472196] netlink_unicast+0x1a8/0x250
[ 57.472949] netlink_sendmsg+0x240/0x480
[ 57.473706] ? __check_object_size+0x4d/0x150
[ 57.474541] sock_sendmsg+0x65/0x70
[ 57.475215] ____sys_sendmsg+0x212/0x280
[ 57.476660] ___sys_sendmsg+0x88/0xd0
[ 57.478060] ? iput+0x148/0x210
[ 57.479356] ? _cond_resched+0x19/0x30
[ 57.480738] ? get_max_files+0x20/0x20
[ 57.482095] __sys_sendmsg+0x5c/0xa0
[ 57.483413] __x64_sys_sendmsg+0x1f/0x30
[ 57.484786] do_syscall_64+0x57/0x190
[ 57.486107] entry_SYSCALL_64_after_hwframe+0x5c/0xc1
[ 57.487677] RIP: 0033:0x40436e
[ 57.488858] Code: 48 89 6c 24 38 48 8d 6c 24 38 e8 0d 00 00 00 48 8b 6c 24 38 48 83 c4 40 c3 cc cc cc 49 89 f2 48 89 fa 48 89 ce 48 89 df 0f 05 <48> 3d 01 f0 ff ff 76 15 48 f7 d8 48 89 c1 48 c7 c0 ff ff ff ff 48
[ 57.494126] RSP: 002b:000000c000199740 EFLAGS: 00000206 ORIG_RAX: 000000000000002e
[ 57.496163] RAX: ffffffffffffffda RBX: 0000000000000078 RCX: 000000000040436e
[ 57.498091] RDX: 0000000000000000 RSI: 000000c000199870 RDI: 0000000000000078
[ 57.500011] RBP: 000000c000199780 R08: 0000000000000000 R09: 0000000000000000
[ 57.501921] R10: 0000000000000000 R11: 0000000000000206 R12: 000000c000199938
[ 57.503831] R13: 0000000000000000 R14: 000000c0005816c0 R15: 000000c000067800
[ 57.505731] Modules linked in: vrf sctp 8021q garp mrp stp llc vmw_vsock_vmci_transport vsock dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua binfmt_misc intel_rapl_msr vmw_balloon intel_rapl_common isst_if_mbox_msr isst_if_common joydev input_leds nfit rapl serio_raw vmw_vmci mac_hid sch_fq_codel gtp5g(OE) udp_tunnel msr ramoops reed_solomon efi_pstore ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid vmwgfx ttm crct10dif_pclmul drm_kms_helper crc32_pclmul ghash_clmulni_intel syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops crypto_simd mptspi cryptd mptscsih glue_helper bnxt_en psmouse drm ahci mptbase vmxnet3 i2c_piix4 libahci scsi_transport_spi pata_acpi
[ 57.524746] CR2: 0000000000000080
[ 57.526104] ---[ end trace d4fa568a26f72f9a ]---
[ 57.527702] RIP: 0010:gtp5g_genl_add_pdr+0x185/0x270 [gtp5g]
[ 57.529497] Code: 01 00 00 be 20 0b 00 00 e8 88 0b f9 e3 49 89 c5 48 85 c0 0f 84 e2 00 00 00 49 8b 54 24 10 b8 01 00 00 00 48 8d ba 80 00 00 00 <f0> 0f c1 82 80 00 00 00 85 c0 74 69 78 5b 83 c0 01 78 56 49 8b 44
[ 57.535267] RSP: 0018:ffffae83c797ba10 EFLAGS: 00010286
[ 57.537051] RAX: 0000000000000001 RBX: ffffae83c797baa8 RCX: 0000000000000000
[ 57.539202] RDX: 0000000000000000 RSI: ffffffffc090cc58 RDI: 0000000000000080
[ 57.541344] RBP: ffffae83c797ba50 R08: ffff9d421fa35140 R09: ffff9d3a1f406d80
[ 57.543531] R10: ffff9d421ab92e00 R11: 0000000000000011 R12: ffff9d3a0360d8c0
[ 57.545695] R13: ffff9d421ab92e00 R14: 0000000000000000 R15: ffff9d421ab90c00
[ 57.547854] FS: 00007f9d5cff9700(0000) GS:ffff9d421fa00000(0000) knlGS:0000000000000000
[ 57.550205] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 57.552138] CR2: 0000000000000080 CR3: 0000000802254005 CR4: 00000000007606e0
[ 57.554361] PKRU: 55555554
Hi @linouxis9
Could you please provide more information? For example:
- Version of gtp5g
- Can you reproduce the kernel panic using go-gtp5gnl? (libgtp5gnl has been archived.)
Hi @ianchen0119,
Thank you for your message! I was indeed using go-gtp5gnl to trigger the kernel panic, not libgtp5gnl. I used this script from libgtp5gnl: https://github.com/free5gc/libgtp5gnl/blob/master/script/ran.sh, but inside the script I replaced the gtp5g-link/gtp5g-tunnel binaries from libgtp5gnl with the gtp5g-link/gtp5g-tunnel Go binaries from go-gtp5gnl, found here: https://github.com/free5gc/go-gtp5gnl/tree/main/cmd.
I did my testing on gtp5g commit 3f425930aa6e972f3f4c5f78b7bdaf0518574101.
Thanks a lot!