"NVRM RmInitAdapter: Cannot initialize GSP firmware RM" error found
NVIDIA Open GPU Kernel Modules Version
520.56.06
Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver.
- [ ] I confirm that this does not happen with the proprietary driver package.
Operating System and Version
Ubuntu 20.04.6 LTS
Kernel Release
5.10.14
Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.
- [X] I am running on a stable kernel release.
Hardware: GPU
NVIDIA GeForce RTX 3080
Describe the bug
We deployed an Ubuntu machine with the NVIDIA Open GPU Kernel Modules driver, version 520. The machine frequently fails to initialize the GPUs and logs the following errors:
NVRM s_executeBooterUcode_TU102: Booter failed with non-zero error code: 0xa
2024-07-02 18:46:08.681559 kernel:[ 21.766727] NVRM kgspExecuteBooterUnloadIfNeeded_TU102: failed to execute Booter Unload: 0xffff
2024-07-02 18:46:08.681562 kernel:[ 21.766734] NVRM nvAssertFailedNoLog: Assertion failed: rmStatus == NV_OK @ osinit.c:1982
ecuteFwsecFrts_HAL(pGpu, pKernelGsp, pKernelGsp->pFwsecUcode, pKernelGsp->pWprMeta->frtsOffset) @ kernel_gsp_ga102.c:164
[ 1731.589314] NVRM nvAssertFailedNoLog: Assertion failed: status == NV_OK @ kernel_gsp_ga102.c:235
[ 1731.589317] NVRM kgspInitRm_IMPL: cannot bootstrap riscv/gsp: 0xffff
[ 1731.589323] NVRM RmInitAdapter: Cannot initialize GSP firmware RM
[ 1731.591779] NVRM: GPU 0000:86:00.0: RmInitAdapter failed! (0x63:0xffff:1684)
[ 1731.593977] NVRM: GPU 0000:86:00.0: rm_init_adapter failed, device minor number 0
[ 1731.777872] NVRM s_executeBooterUcode_TU102: Booter failed with non-zero error code: 0xa
[ 1731.777876] NVRM kgspExecuteBooterUnloadIfNeeded_TU102: failed to execute Booter Unload: 0xffff
[ 1731.800951] NVRM s_executeFwsec_TU102: failed to execute FWSEC for FRTS: FRTS error code 0xbe
[ 1731.800957] NVRM nvAssertOkFailedNoLog: Assertion failed: Failure: Generic Error [NV_ERR_GENERIC] (0x0000FFFF) returned from kgspExecuteFwsecFrts_HAL(pGpu, pKernelGsp, pKernelGsp->pFwsecUcode, pKernelGsp->pWprMeta->frtsOffset) @ kernel_gsp_ga102.c:164
[ 1731.800963] NVRM nvAssertFailedNoLog: Assertion failed: status == NV_OK @ kernel_gsp_ga102.c:235
[ 1731.800965] NVRM kgspInitRm_IMPL: cannot bootstrap riscv/gsp: 0xffff
[ 1731.800970] NVRM RmInitAdapter: Cannot initialize GSP firmware RM
[ 1731.803388] NVRM: GPU 0000:af:00.0: RmInitAdapter failed! (0x63:0xffff:1684)
[ 1731.805517] NVRM: GPU 0000:af:00.0: rm_init_adapter failed, device minor number 1
[ 1731.989155] NVRM s_executeBooterUcode_TU102: Booter failed with non-zero error code: 0xa
[ 1731.989160] NVRM kgspExecuteBooterUnloadIfNeeded_TU102: failed to execute Booter Unload: 0xffff
[ 1732.012716] NVRM s_executeFwsec_TU102: failed to execute FWSEC for FRTS: FRTS error code 0xbe
[ 1732.012722] NVRM nvAssertOkFailedNoLog: Assertion failed: Failure: Generic Error [NV_ERR_GENERIC] (0x0000FFFF) returned from kgspExecuteFwsecFrts_HAL(pGpu, pKernelGsp, pKernelGsp->pFwsecUcode, pKernelGsp->pWprMeta->frtsOffset) @ kernel_gsp_ga102.c:164
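For triage, it can help to see at a glance which GPUs hit "RmInitAdapter failed!" and with which codes. The following is a minimal sketch, not part of the driver or its tooling: the function name `summarize_adapter_failures` and the field names `first`, `status`, and `third` are hypothetical. Reading the middle field as a status code is an assumption, based only on it matching the 0xffff (NV_ERR_GENERIC) value printed elsewhere in this log.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   NVRM: GPU 0000:86:00.0: RmInitAdapter failed! (0x63:0xffff:1684)
# The names "first", "status" and "third" are illustrative labels only.
ADAPTER_FAIL = re.compile(
    r"NVRM: GPU (?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]): "
    r"RmInitAdapter failed!\s*"
    r"\((?P<first>0x[0-9a-f]+):(?P<status>0x[0-9a-f]+):(?P<third>\d+)\)",
    re.IGNORECASE,
)


def summarize_adapter_failures(log_text):
    """Group the RmInitAdapter failure triples by GPU PCI address."""
    failures = defaultdict(list)
    for match in ADAPTER_FAIL.finditer(log_text):
        failures[match.group("bdf")].append(
            (match.group("first"), match.group("status"), match.group("third"))
        )
    return dict(failures)


# Lines copied from the log in this report.
sample_log = """\
[ 1731.591779] NVRM: GPU 0000:86:00.0: RmInitAdapter failed! (0x63:0xffff:1684)
[ 1731.593977] NVRM: GPU 0000:86:00.0: rm_init_adapter failed, device minor number 0
[ 1731.803388] NVRM: GPU 0000:af:00.0: RmInitAdapter failed! (0x63:0xffff:1684)
"""

for bdf, entries in summarize_adapter_failures(sample_log).items():
    print(bdf, entries)
```

Run against a full kernel log, this shows whether the failure is confined to one device or, as in the excerpt above, affects both GPUs with the same codes.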
To Reproduce
Boot the machine with the 520.56.06 open-source NVIDIA driver installed.
Bug Incidence
Sometimes
nvidia-bug-report.log.gz
More Info
No response
I think you should also try this with a newer version, since 520 is no longer supported.
The current branches are:
- 535 Production Stable
- 550 Stable
- 555 New Feature
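Before retesting, it may be worth confirming which branch is actually loaded. The sketch below extracts the branch number from a driver version string and checks it against the branches named in this comment. The helper names (`driver_branch`, `is_listed_branch`) are hypothetical, the branch set is only what this thread mentions and will go stale, and the sample string is illustrative, built from the version in this report rather than captured from the affected machine.

```python
import re

# Branches named in this thread; an assumption that will go out of date.
LISTED_BRANCHES = {535, 550, 555}


def driver_branch(version_text):
    """Return the three-digit branch (e.g. 520) found in version_text, or None."""
    match = re.search(r"\b(\d{3})\.\d+(?:\.\d+)?\b", version_text)
    return int(match.group(1)) if match else None


def is_listed_branch(branch):
    """True if the branch is one of those listed above."""
    return branch in LISTED_BRANCHES


# Illustrative string only; not taken from the reporter's system.
sample = "NVRM version: NVIDIA UNIX Open Kernel Module for x86_64  520.56.06  Release Build"

branch = driver_branch(sample)
print(branch, is_listed_branch(branch))
```

On a machine with the NVIDIA kernel module loaded, the text would typically come from `/proc/driver/nvidia/version`; the exact wording of that file may differ from the sample used here.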
RmInitAdapter: Cannot initialize GSP firmware RM
[ 387.346751] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x62:0x56:1993)
[ 387.355516] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[ 387.562971] NVRM: nvAssertFailed: Assertion failed: 0 @ g_kernel_sec2_nvoc.h:792
[ 387.562991] NVRM: nvAssertFailedNoLog: Assertion failed: pBinArchive != NULL @ kernel_gsp_booter.c:487
[ 387.562998] NVRM: nvCheckOkFailedNoLog: Check failed: Call not supported [NV_ERR_NOT_SUPPORTED] (0x00000056) returned from kgspAllocateScrubberUcodeImage(pGpu, pKernelGsp, &pKernelGsp->pScrubberUcode) @ kernel_gsp.c:3486
[ 387.563000] NVRM: nvCheckOkFailedNoLog: Check failed: Call not supported [NV_ERR_NOT_SUPPORTED] (0x00000056) returned from _kgspPrepareScrubberImageIfNeeded(pGpu, pKernelGsp) @ kernel_gsp.c:3635
I saw this thread and am facing a similar issue. I'm using driver 580.65.06 on an RK3588 SBC (Axon) with a discrete RTX 3080.
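The "Check failed ... returned from ..." lines in the log above name the status, the failing call, and the source location, which makes it easy to compare this failure chain with the one in the original report. A minimal sketch for pulling those fields out, assuming the line format stays as shown; `parse_failed_calls` is a hypothetical helper, not part of any NVIDIA tooling:

```python
import re

# Matches the tail of lines such as:
#   ... [NV_ERR_NOT_SUPPORTED] (0x00000056) returned from
#   kgspAllocateScrubberUcodeImage(...) @ kernel_gsp.c:3486
FAILED_CALL = re.compile(
    r"\[(?P<name>NV_ERR_[A-Z_]+)\]\s*\((?P<code>0x[0-9A-Fa-f]+)\)\s*"
    r"returned from\s*(?P<func>\w+)\(.*?\)\s*@\s*(?P<file>[\w.]+):(?P<line>\d+)"
)


def parse_failed_calls(log_text):
    """Return (status name, status code, function, file, line) per failed call."""
    return [
        (
            m.group("name"),
            int(m.group("code"), 16),
            m.group("func"),
            m.group("file"),
            int(m.group("line")),
        )
        for m in FAILED_CALL.finditer(log_text)
    ]


# Lines copied from the log above.
sample_log = (
    "[ 387.562998] NVRM: nvCheckOkFailedNoLog: Check failed: Call not supported "
    "[NV_ERR_NOT_SUPPORTED] (0x00000056) returned from "
    "kgspAllocateScrubberUcodeImage(pGpu, pKernelGsp, &pKernelGsp->pScrubberUcode) "
    "@ kernel_gsp.c:3486\n"
    "[ 387.563000] NVRM: nvCheckOkFailedNoLog: Check failed: Call not supported "
    "[NV_ERR_NOT_SUPPORTED] (0x00000056) returned from "
    "_kgspPrepareScrubberImageIfNeeded(pGpu, pKernelGsp) @ kernel_gsp.c:3635\n"
)

for entry in parse_failed_calls(sample_log):
    print(entry)
```

Here both entries carry NV_ERR_NOT_SUPPORTED (0x56) from the scrubber image path, whereas the original report shows NV_ERR_GENERIC (0xFFFF) from `kgspExecuteFwsecFrts_HAL`, so the two failures may not share a cause.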
Dear @hrushirajg23, thank you for reporting the issue. Could you please generate a bug report while the issue is reproducing and attach it for triage?
@amrit1711
- the open driver -
- proprietary driver -
Regarding the open driver, I solved the "chipset not recognized" problem by adding my chip info. I have no idea what causes the "Cannot initialize GSP firmware RM" issue.
Do let me know if you need any more information. Thanks
Thank you, we will analyze logs and get back to you.