MxGPU-Virtualization
kernel panic on 4.9.0 for s7150x2
[ 63.725792] gim: loading out-of-tree module taints kernel.
[ 63.728298] gim info:(gim_init:149) Start AMD open source GIM initialization
[ 63.728299] gim info:(gim_init:152) GPU IOV MODULE - version 1.1.4
[ 63.728299] gim info:(gim_init:154) Copyright (c) 2014-2017 Advanced Micro Devices, Inc. All rights reserved.
[ 63.728305] gim info:(parse_config_file:219) AMD GIM fb_option = 0
[ 63.728305] gim info:(parse_config_file:219) AMD GIM sched_option = 0
[ 63.728306] gim info:(parse_config_file:219) AMD GIM vf_num = 0
[ 63.728306] gim info:(parse_config_file:219) AMD GIM pf_fb = 0
[ 63.728306] gim info:(parse_config_file:219) AMD GIM vf_fb = 0
[ 63.728307] gim info:(parse_config_file:219) AMD GIM sched_interval = 0
[ 63.728307] gim info:(parse_config_file:219) AMD GIM sched_interval_us = 0
[ 63.728308] gim info:(parse_config_file:219) AMD GIM fb_clear = 0
[ 63.728308] gim info:(init_config:341) INIT CONFIG
[ 63.773658] gim info:(enumerate_all_pfs:146) pfdev :81d60000
[ 63.773659] gim info:(enumerate_all_pfs:146) pfdev :81d63000
[ 63.773660] dwj pf_count : 2
[ 63.773662] gim info:(set_new_adapter:572) curr allocated at ffffffffc0c05d80
[ 63.773662] gim info:(set_new_adapter:579) SRIOV is supported
[ 63.773665] gim info:(set_new_adapter:587) found PCI bridge device
[ 63.773667] gim info:(set_new_adapter:591) found: 02:8.0
[ 63.773691] gim info:(set_new_adapter:608) mmio_base = ffffaa7388fc0000
[ 63.773696] gim info:(set_new_adapter:610) doorbell = ffffaa7389e00000
[ 63.773697] gim error:(map_fb:369) can't iomap for BAR 0
[ 63.774281] gim info:(set_new_adapter:612) pf.fb_va = (null)
[ 63.774293] gim info:(sriov_is_ari_enabled:164) PCI_SRIOV_CAP = 0x00000002
[ 63.774295] gim info:(sriov_is_ari_enabled:174) PCI_SRIOV_CTRL = 0x00000010
[ 63.774295] gim info:(sriov_is_ari_enabled:177) PCI_SRIOV_CTRL_ARI is set --> ARI is supported
[ 63.774298] gim info:(program_ari_mode:441) Read bif_strap8 = 0x00200004
[ 63.774299] gim info:(program_ari_mode:446) program_ari_mode - Set ARI_Mode = PF_BUS
[ 63.774299] gim info:(program_ari_mode:456) Write bif_strap8 = 0x00000004
[ 63.774300] gim info:(gim_read_rom_from_reg:181) Reading VBios from ROM
[ 63.774419] gim info:(gim_read_vbios:243) VBIOS starts: 0x55, 0xaa
[ 63.774420] gim info:(gim_read_vbios:246) VBios size is 0x10000
[ 63.774429] gim info:(gim_read_vbios:249) vbios allocated at ffffaa7383ac1000
[ 63.774429] gim info:(gim_read_rom_from_reg:181) Reading VBios from ROM
[ 63.911429] gim info:(gim_read_vbios:257) BIOS Version Major 0xF Minor 0x31
[ 63.911458] gim info:(gim_read_vbios:270) Valid video BIOS image,
[ 63.911458] gim info:(gim_read_vbios:271) size = 0x10000, check sum is 0x543c00
[ 63.911464] gim info:(gim_post_vbios:302) Init Parser passed!, continue
[ 63.911467] gim info:(atom_chk_asic_status:333) ATOM_CheckAsicStatus - BIOS_SCRATCH_7 = 0x00000000
[ 63.911467] gim info:(atom_chk_asic_status:336) Isolate ATOM_S7_ASIC_INIT_COMPLETE_MASK bit(s) = 0x00000000
[ 63.911469] gim info:(atom_chk_asic_status:339) RLC_CNTL = 0x00000000
[ 63.911469] gim info:(atom_chk_asic_status:341) Isolate RLC_CNTL__RLC_ENABLE_F32_MASK = 0x00000000
[ 63.911469] gim info:(atom_chk_asic_status:348) ATOM_ASIC_NEED_POST
[ 63.911470] gim info:(gim_post_vbios:305) Asic needs a VBios post
[ 63.911470] gim info:(atom_post_vbios:200) ATOM_PostVBIOS: firmware_info passed
[ 63.911470] gim info:(atom_post_vbios:253) asic_init before, engine clock = 7530; memory clock =1e848
[ 64.233696] gim info:(atom_post_vbios:256) asic_init after
[ 64.233696] gim info:(atom_post_vbios:263) atom_init_fan_cntl before
[ 64.233696] gim info:(atom_post_vbios:265) atom_init_fan_cntl after
[ 64.233697] gim info:(gim_post_vbios:311) Post INIT_ASIC successfully!
[ 64.233708] gim info:(firmware_requires_update:510) SMU option ROM version 0x111700
[ 64.233708] gim info:(firmware_requires_update:511) versus patch version 0x111a00
[ 64.233720] gim info:(firmware_requires_update:521) RLCV option ROM version 113 versus patch version 129
[ 64.233720] gim info:(firmware_requires_update:526) TOC found, update it
[ 64.233721] gim info:(patch_firmware:586) Update smc_init table
[ 64.591918] BUG: unable to handle kernel paging request at 0000000000020000
[ 64.592161] IP: [
[ 64.592635] Oops: 0002 [#1] SMP
[ 64.592863] Modules linked in: gim(OE+) openvswitch(E) nf_conntrack_ipv6(E) nf_nat_ipv6(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) nf_nat_ipv4(E) nf_defrag_ipv6(E) nf_nat(E) nf_conntrack(E) libcrc32c(E) crc32c_generic(E) mptctl(E) mptbase(E) ib_iser(E) rdma_cm(E) iw_cm(E) ib_cm(E) ib_core(E) configfs(E) iscsi_tcp(E) libiscsi_tcp(E) libiscsi(E) scsi_transport_iscsi(E) nls_ascii(E) nls_cp437(E) vfat(E) fat(E) snd_hda_codec_hdmi(E) snd_hda_codec_realtek(E) snd_hda_codec_generic(E) i915(E) drm_kms_helper(E) drm(E) intel_rapl(E) i2c_algo_bit(E) x86_pkg_temp_thermal(E) hci_uart(E) snd_hda_intel(E) intel_powerclamp(E) snd_hda_codec(E) btbcm(E) btqca(E) snd_hda_core(E) iTCO_wdt(E) snd_hwdep(E) snd_pcm(E) btintel(E) bluetooth(E) eeepc_wmi(E) asus_wmi(E) coretemp(E) iTCO_vendor_support(E) snd_timer(E) intel_lpss_acpi(E)
[ 64.594497] sparse_keymap(E) psmouse(E) mxm_wmi(E) serio_raw(E) evdev(E) joydev(E) kvm_intel(E) intel_lpss(E) mfd_core(E) efi_pstore(E) i2c_i801(E) video(E) shpchp(E) mei_me(E) mei(E) snd(E) soundcore(E) battery(E) rfkill(E) efivars(E) i2c_smbus(E) kvm(E) irqbypass(E) pcspkr(E) crct10dif_pclmul(E) crc32_pclmul(E) tpm_tis(E) acpi_als(E) ghash_clmulni_intel(E) acpi_pad(E) tpm_tis_core(E) kfifo_buf(E) industrialio(E) tpm(E) wmi(E) button(E) ipmi_watchdog(E) ipmi_poweroff(E) ipmi_devintf(E) ipmi_msghandler(E) fuse(E) autofs4(E) ext4(E) crc16(E) jbd2(E) fscrypto(E) mbcache(E) hid_generic(E) sg(E) usbhid(E) sd_mod(E) crc32c_intel(E) aesni_intel(E) aes_x86_64(E) glue_helper(E) lrw(E) gf128mul(E) ablk_helper(E) cryptd(E) ahci(E) libahci(E) xhci_pci(E) libata(E) xhci_hcd(E) r8169(E) mii(E) usbcore(E) scsi_mod(E)
[ 64.596414] usb_common(E) fan(E) thermal(E) i2c_hid(E) hid(E) fjes(E)
[ 64.597078] CPU: 7 PID: 2331 Comm: insmod Tainted: G OE 4.9.0-0.bpo.1-linx-security-amd64 #1 Linx 4.9.2-2~bpo8+1linx2
[ 64.597852] Hardware name: System manufacturer System Product Name/B365M-KYLIN, BIOS 1202 07/15/2019
[ 64.598236] task: ffff9e8680a33000 task.stack: ffffaa7389758000
[ 64.598620] RIP: 0010:[
Hi, I've got exactly the same issue (same trace). The hardware is an HP DL380 Gen8 with an S7150 x2. The OS is Proxmox 5.4 (kernel 4.15.18-24-pve). How can I resolve this? Thank you.
Hi,
Same issue here on an ASUS KGPE-D16, also with an S7150 x2. The machine boots with these kernel parameters: quiet reboot=cold mem=256G rcu_nocbs=0-31 amd_iommu=on iommu=pt pci=realloc enable_mtrr_cleanup=1 video=efifb:off
Linux a4d8 5.4.73-1-pve #1 SMP PVE 5.4.73-1 (Mon, 16 Nov 2020 10:52:16 +0100) x86_64 GNU/Linux
I also get these messages in dmesg:
[ 3.339577] pci 0000:04:00.0: BAR 0: no space for [mem size 0x10000000 64bit pref]
[ 3.339578] pci 0000:04:00.0: BAR 0: failed to assign [mem size 0x10000000 64bit pref]
[ 3.339580] pci 0000:04:00.0: BAR 7: no space for [mem size 0x100000000 64bit pref]
[ 3.339581] pci 0000:04:00.0: BAR 7: failed to assign [mem size 0x100000000 64bit pref]
[ 3.339583] pci 0000:04:00.0: BAR 9: assigned [mem 0xb4400000-0xb83fffff 64bit pref]
[ 3.339587] pci 0000:04:00.0: BAR 12: no space for [mem size 0x04000000]
[ 3.339588] pci 0000:04:00.0: BAR 12: failed to assign [mem size 0x04000000]
[ 3.339589] pci 0000:04:00.0: BAR 2: assigned [mem 0xb4200000-0xb43fffff 64bit pref]
[ 3.339597] pci 0000:04:00.0: BAR 5: no space for [mem size 0x00040000]
[ 3.339598] pci 0000:04:00.0: BAR 5: failed to assign [mem size 0x00040000]
[ 3.339600] pci 0000:04:00.0: BAR 0: no space for [mem size 0x10000000 64bit pref]
[ 3.339602] pci 0000:04:00.0: BAR 0: failed to assign [mem size 0x10000000 64bit pref]
[ 3.339603] pci 0000:04:00.0: BAR 2: assigned [mem 0xb4200000-0xb43fffff 64bit pref]
[ 3.339611] pci 0000:04:00.0: BAR 5: assigned [mem 0xb4400000-0xb443ffff]
[ 3.339615] pci 0000:04:00.0: BAR 12: no space for [mem size 0x04000000]
[ 3.339616] pci 0000:04:00.0: BAR 12: failed to assign [mem size 0x04000000]
[ 3.339617] pci 0000:04:00.0: BAR 9: no space for [mem size 0x04000000 64bit pref]
[ 3.339618] pci 0000:04:00.0: BAR 9: failed to assign [mem size 0x04000000 64bit pref]
[ 3.339620] pci 0000:04:00.0: BAR 7: no space for [mem size 0x100000000 64bit pref]
[ 3.339621] pci 0000:04:00.0: BAR 7: failed to assign [mem size 0x100000000 64bit pref]
I'm using GIM from https://github.com/kasperlewau/MxGPU-Virtualization.
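Because the kernel retries BAR assignment several times, the same BAR can show up as both "assigned" and "failed to assign" in the dmesg excerpt above. A rough way to see the last reported state per BAR is a small script like the one below. This is only a sketch and not part of GIM: the function name last_bar_status is made up for illustration, and it assumes the message wording matches the lines pasted in this thread.

```python
import re

# Matches dmesg lines such as:
#   pci 0000:04:00.0: BAR 0: failed to assign [mem size 0x10000000 64bit pref]
#   pci 0000:04:00.0: BAR 2: assigned [mem 0xb4200000-0xb43fffff 64bit pref]
# "no space for" lines are deliberately ignored; each is followed by a
# "failed to assign" line for the same BAR.
BAR_LINE = re.compile(
    r"pci (?P<dev>[0-9a-f:.]+): BAR (?P<bar>\d+): "
    r"(?P<status>failed to assign|assigned) \[mem (?P<detail>[^\]]*)\]"
)


def last_bar_status(dmesg_text):
    """Return {(device, bar_number): (status, detail)} using the last
    matching line for each BAR, in log order (hypothetical helper)."""
    result = {}
    for line in dmesg_text.splitlines():
        match = BAR_LINE.search(line)
        if match is None:
            continue
        status = "failed" if match.group("status").startswith("failed") else "assigned"
        key = (match.group("dev"), int(match.group("bar")))
        # Later lines overwrite earlier ones, so retries are reflected.
        result[key] = (status, match.group("detail"))
    return result
```

Feeding it the saved output of dmesg gives one entry per device and BAR, which makes it easier to tell which BARs were still unassigned after the last retry that was logged.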