
Crash of CEncoder (VCE: AMF Error 4) when the monitor is connected to an unsupported GPU in a multi-GPU setup

Open scscgit opened this issue 4 years ago • 11 comments

I have been using a desktop with a supported (dual) AMD Radeon (RX580) GPU, which works correctly with ALVR when the monitor is connected (using DisplayPort) to the AMD GPU. However, when I switched the monitor cable to the internal GPU (Intel(R) UHD Graphics 630), which obviously isn't supported by ALVR, I started getting the following error after launching the SteamVR server (which also always switches to Safe Mode and has to be switched back manually), in both of the latest experimental versions, 8 and 9:

Failed to initialize CEncoder. All VideoEncoder are not available. VCE: AMF Error 4. m_amfContext->InitDX11(m_d3dRender->GetDevice()), NVENC: Failed to load nvcuda.dll. Please check if NVIDIA graphic driver is installed.

I have two issues with this behavior, as it's a completely normal setup to connect the monitor to an iGPU (e.g. to save power, or to lessen the load on other GPUs - which improves computational performance):

  1. There is literally no other report of this error anywhere, so it took me several weeks before I accidentally realized that this was the source of the issue (and not just some driver updates). I believe this behavior should be documented in the program itself.
  2. If it's possible and easy to implement an enumeration and a selection of a GPU, it would be nice to be able to select a GPU to be used on a multi-GPU setup.

scscgit avatar Mar 02 '20 16:03 scscgit

I'm afraid this is a normal behavior in a multi-monitor, multi-gpu environment and even applies to "normal" VR applications as well. I don't know if there is a way to detect this, but if you try the same setup with the Rift-S, you get an error that the primary display has to be connected to the same gpu as the headset.

I agree that this is not properly documented and the error code should be more precise. Unfortunately, I have no AMD gpu and cannot test it any further. The AMF code is the original implementation from polygraphene.

JackD83 avatar Mar 02 '20 17:03 JackD83

I forgot to mention that I'm using an Oculus Quest, so I have no idea if this applies to the Rift-S. If this is a well-known behavior under different (multi-monitor) setups and requires some workarounds, then there's probably nothing else to do except document the issue, so that newcomers will know what to do. I believe it'd be enough to just compare the error message with this exact string and provide the additional hint (or a link to this issue). If there's some way to test an alternative solution for an AMD GPU (though to me it seems to be more a matter of the Intel iGPU than of AMD), I could test it for you (or provide some remote access).

I know this kind of issue should be addressed by polygraphene, but due to that repo's inactivity the only chance is if you (or some other contributor) are experienced in that low-level implementation. I've found this mention in the source, and it references mAdapterIndex and later nDisplayAdapterIndex, so I thought it may be possible to enumerate and expose the list of adapters to a user:

https://github.com/polygraphene/ALVR/blob/d21086900a76a28a5fa097e98634e7dcac361397/alvr_server/OpenVRHmd.cpp#L122

// It seems vrcompositor selects always(?) first adapter. vrcompositor may use Intel iGPU when user sets it as primary adapter. I don't know what happens on laptop which support optimus.
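To make the idea more concrete, here is a rough sketch of the check I have in mind. It is only an illustration under assumptions, not ALVR code: the AdapterInfo struct and the function names are made up, and I'm assuming that only AMD (VCE) and NVIDIA (NVENC) adapters can encode. On Windows the list would presumably be filled from DXGI (IDXGIFactory1::EnumAdapters1 and IDXGIAdapter::EnumOutputs), which I left out to keep the sketch self-contained.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of one GPU; the field names are invented for this sketch.
struct AdapterInfo {
    std::uint32_t vendorId;   // PCI vendor ID reported by the adapter
    std::string name;         // human-readable name, e.g. for a selection list
    bool hasDisplayAttached;  // true if at least one monitor is connected to it
};

// PCI vendor IDs of the two vendors whose encoders appear in the error message.
constexpr std::uint32_t kVendorAmd = 0x1002;
constexpr std::uint32_t kVendorNvidia = 0x10DE;

// Assumption: only AMD (VCE) and NVIDIA (NVENC) adapters have a usable encoder.
bool HasHardwareEncoder(const AdapterInfo& adapter) {
    return adapter.vendorId == kVendorAmd || adapter.vendorId == kVendorNvidia;
}

// Returns the index of the first adapter with a hardware encoder, or -1 if none exists.
int FindEncoderAdapter(const std::vector<AdapterInfo>& adapters) {
    for (std::size_t i = 0; i < adapters.size(); ++i) {
        if (HasHardwareEncoder(adapters[i])) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// True when an encoder-capable GPU exists, but no monitor is attached to any such GPU
// (the setup described in this issue: monitor on the iGPU, encoder on the AMD GPU).
bool MonitorOnUnsupportedGpu(const std::vector<AdapterInfo>& adapters) {
    bool encoderExists = false;
    bool displayOnEncoderGpu = false;
    for (const AdapterInfo& adapter : adapters) {
        if (HasHardwareEncoder(adapter)) {
            encoderExists = true;
            if (adapter.hasDisplayAttached) {
                displayOnEncoderGpu = true;
            }
        }
    }
    return encoderExists && !displayOnEncoderGpu;
}
```

With something like this, the index from FindEncoderAdapter could be offered to the user as a default, and MonitorOnUnsupportedGpu could decide whether to show an extra warning. Whether vrcompositor would actually honor a different adapter index is a separate question that this sketch does not answer.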

scscgit avatar Mar 02 '20 18:03 scscgit

I'm aware that you are using the Quest. It's the only HMD supported by ALVR. What I was trying to tell you is that this error is not an issue of ALVR but a technical limitation of all VR headsets!

Monitor on one GPU and hmd on another GPU does not work

The only things that we could change are:

  • Better error message with more details about the error: Would require to implement a check and enumerate monitors and GPUs. As I'm unfamiliar with the APIs for that, this requires a lot of work on my end.
  • Steam crashing and going into Safe Mode: I hope this issue was addressed with commit e23494e785df1673e6a1d83014da483b60890003 and is linked to #151

JackD83 avatar Mar 29 '20 10:03 JackD83

It's too bad if this is a hard limitation of VR headsets. If you do by any chance know exactly which library/module this depends on (e.g. the VRCompositor mentioned in the comments, which may make it a limitation of SteamVR/OpenVR itself), maybe we could at least report it as their own issue. Could they really have missed such a huge flaw in their testing? I skimmed their documentation and this scenario doesn't seem to be addressed whatsoever. These two issues may be related though: "GPU driver limitations" in https://github.com/ValveSoftware/openvr/issues/394 and a mention of Optimus in https://github.com/ValveSoftware/openvr/issues/556.

Related to the simplest hotfix, can't you just compare the error message string directly like I mentioned in my comment, adding some user-friendly explanation on top of it? I believe stuff like "Failed to initialize CEncoder, VCE: AMF Error 4, and NVENC: Failed to load nvcuda.dll" will stay constant. And hopefully, the root of the issue will get fixed before they modify the error messages, if they ever do so ;)
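Something along these lines is what I mean, as a minimal sketch only. AppendMultiGpuHint is a hypothetical helper that doesn't exist in ALVR, the hint wording is just a suggestion, and it assumes the two substrings below stay stable. It keeps the original message intact and only appends to it:

```cpp
#include <string>

// Hypothetical helper (not part of ALVR): appends a hint when the encoder error
// contains both the AMF and the NVENC failure reported in this issue.
std::string AppendMultiGpuHint(const std::string& error) {
    const bool amfFailed =
        error.find("VCE: AMF Error 4") != std::string::npos;
    const bool nvencFailed =
        error.find("NVENC: Failed to load nvcuda.dll") != std::string::npos;

    if (!(amfFailed && nvencFailed)) {
        return error;  // Different failure: leave the message untouched.
    }

    // The hint is worded as one possible cause, since other causes
    // (e.g. an unsupported GPU) produce a similar message.
    return error +
           "\nHint: one possible cause is that the primary monitor is connected "
           "to a different GPU (e.g. an Intel iGPU) than the one used for encoding. "
           "Try connecting the monitor to the GPU that drives the headset.";
}
```

Since the hint is only appended, the original text stays searchable, and any other error passes through unchanged.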

scscgit avatar Mar 29 '20 15:03 scscgit

This is pretty much an edge case people rarely stumble over. At least Oculus is aware of the problem and presents you an error message with an explanation. The "Failed to initialize..." error message is produced by ALVR and describes the result. There are multiple reasons that can cause this (an unsupported GPU, for example). That's why it cannot simply be replaced.

Maybe I'll extend the error message to include possible reasons.

JackD83 avatar Mar 30 '20 15:03 JackD83

Of course the error message mustn't be replaced, all I intended was for this possible and very likely reason to be appended after or before the message to make it reflect this use-case. Let's make the fix as easy as possible, hoping that someone experienced will properly address this in the dependency library :)

  • As a side-note, I'll very strongly disagree with your opinion that this edge case is something "people rarely stumble over" though, as there's zero feedback to even suggest that this may be causing the issue. You don't get any Windows warning, and you can't find an unambiguous guideline for which exact driver functions "can or cannot work when not connected directly to the monitor". (And it's even worse if one believes that a 2x GPU setup has no issue splitting the work between them, because the manufacturers seem to be actually removing this feature instead of improving it.) There are various use-cases for preferring to use the monitor with an iGPU, e.g. power-saving, more efficient GPU computation (probably even graphical rendering), and there may be other motivations within scenarios like laptops with Optimus. In my case I personally wasted several weeks debugging this along with multiple other Oculus software issues, like Link randomly not working even with a short cable in a "correct MOBO port", and there's an unceasing stream of hardware issues like voltage drop BSODs that are even more impossible to debug in spite of free voltage monitor software existing on the market. One issue after another, and a person wastes a few months doing nothing but looking for dozens of crazy software workarounds, and that's even if you are an experienced power-user, so I can't begin to imagine what the beginners would go through... :)

scscgit avatar Mar 31 '20 10:03 scscgit

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Apr 30 '20 11:04 stale[bot]

There is PR #187 that implements a better error message, but I wanted to be more general. It will take some extra time.

JackD83 avatar May 07 '20 12:05 JackD83

I don't know if it helps, but within the AMD Adrenalin control panel there should be an option "switchable Graphics devices" where it seems you can assign the preferred GPU to an application (ALVR, SteamVR). See https://www.amd.com/en/support/kb/faq/dh-017 and https://www.amd.com/de/support/kb/faq/dh-004 and look for "switchable graphics".

MPfaffe avatar May 30 '20 12:05 MPfaffe

@MPfaffe I doubt this issue is so simple that it could be fixed by switching default GPU using Radeon software; by the way, your links are both outdated, as you can find this on the top of the 2nd page; are you using AMD at all if you haven't yet experienced their new "improved" UI forced by latest drivers?

NOTE! Radeon Additional Settings was retired in Radeon Software Crimson ReLive Edition 17.7.2. Its previously supported controls for AMD Eyefinity, Switchable Graphics, Color Depth, Pixel Format, and Power are now available in Radeon Settings. For more information, please visit: Radeon™ Software Help Center

I assume this is the correct configuration, as the Global Graphics button configures the Gaming profile, where I can switch between GPU 1 and GPU 2 (but not to the internal GPU) - it's getting less intuitive with every update. I'm sure I've tried several such settings already without success (but the issue is solved by switching the monitor cable anyway):

image

Also, it's interesting that when I tried it now, I started getting a new error (after getting the original error once before setting anything):

Could not create graphics device for adapter 0. Requires a minimum of two graphics cards.

image

This error also persists even if I remove ALVR from the list of games and restart it... If anyone feels like debugging other scenarios, don't forget to post your findings ;)

scscgit avatar May 30 '20 13:05 scscgit

Ping?

Vixea avatar Mar 18 '23 05:03 Vixea