Bug with latest AMD PRO drivers 22Q4
Expected behaviour
Just works.
Current behaviour
Crashes on the first call to caspar::accelerator::ogl::texture::impl::copy_to
Steps to reproduce
- Install AMD Pro drivers 22Q4
- Start CasparCG server and load a color (LOAD RED)
Environment
- Commit: 2.3.0, 2.3.3 and master
- Server version: [e.g. v2.2]
- Operating system: [e.g. Windows 11]
I played a bit with the code here and it seems that it's due to the format / type not working properly.
When I change format to GL_RGBA and type to GL_UNSIGNED_BYTE I get results but Red and Green channels are swapped. But any call with GL_BGRA
Guess it's a driver issue but maybe there is a workaround in CasparCG. Reported that with the AMD driver software 🤞🏼
Just want to bump this issue, it still occurs with the latest AMD drivers. Effectively CasparCG is unusable right now with an AMD graphics card if the drivers are newer than 2020 or so.
Seems fine on ubuntu 22.04 OpenGL 4.6 (Core Profile) Mesa 22.2.5 AMD
using the onboard gpu from a 7950x is running without issue.
Unless I install windows on this machine to figure out this one bug, I am not able to do anything on this myself.
When I change format to GL_RGBA and type to GL_UNSIGNED_BYTE I get results but Red and Green channels are swapped. But any call with GL_BGRA
I wonder if both of those changes are necessary?
Changing GL_BGRA to GL_RGBA will likely have large implications elsewhere. For example, the decklink driver accepts BGRA or ARGB, so while we could probably composite in RGBA, we would have to convert it to BGRA at some point.
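To illustrate what "convert it to BGRA at some point" would mean at its simplest, here is a minimal CPU-side sketch that swaps the red and blue bytes of every 4-byte pixel. The helper name `swap_red_blue` is hypothetical and is not part of CasparCG; a real fix would more likely do the swizzle on the GPU, so treat this only as a picture of the data layout involved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper, not CasparCG code.
// Swaps the first and third byte of every 4-byte pixel in place.
// Applied to RGBA data this yields BGRA, and applied to BGRA it yields RGBA.
void swap_red_blue(std::vector<std::uint8_t>& pixels)
{
    // Assumes tightly packed 8-bit, 4-channel pixels.
    assert(pixels.size() % 4 == 0);

    for (std::size_t i = 0; i + 3 < pixels.size(); i += 4) {
        std::swap(pixels[i], pixels[i + 2]);
    }
}
```

Calling it on a buffer read back as GL_RGBA would give the byte order a BGRA consumer expects, at the cost of one extra pass over every frame.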
It sounds like GL_UNSIGNED_INT_8_8_8_8_REV vs GL_UNSIGNED_BYTE could have no impact. Based on https://stackoverflow.com/questions/7786187/opengl-texture-upload-unsigned-byte-vs-unsigned-int-8-8-8-8, it looks like it is a performance optimisation, but as all the architectures we may want to run on are little-endian, changing it might have no effect?
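The little-endian argument can be checked without a GPU. The sketch below packs four components the way GL_UNSIGNED_INT_8_8_8_8_REV lays them out (first component in the lowest 8 bits) and compares the resulting memory bytes with the plain per-byte order that GL_UNSIGNED_BYTE implies. All function names are made up for illustration, and this only models the memory layout, not what any particular driver does with it.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Pack four 8-bit components in the GL_UNSIGNED_INT_8_8_8_8_REV layout:
// the first component occupies bits 0-7, the fourth occupies bits 24-31.
std::uint32_t pack_8_8_8_8_rev(std::uint8_t c0, std::uint8_t c1,
                               std::uint8_t c2, std::uint8_t c3)
{
    return static_cast<std::uint32_t>(c0) |
           (static_cast<std::uint32_t>(c1) << 8) |
           (static_cast<std::uint32_t>(c2) << 16) |
           (static_cast<std::uint32_t>(c3) << 24);
}

// Return the bytes of a packed pixel in the order they sit in memory.
std::array<std::uint8_t, 4> memory_bytes(std::uint32_t packed)
{
    std::array<std::uint8_t, 4> bytes{};
    std::memcpy(bytes.data(), &packed, bytes.size());
    return bytes;
}

// Runtime endianness probe: true when the lowest byte is stored first.
bool is_little_endian()
{
    const std::uint16_t probe = 1;
    std::uint8_t first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// True when the packed layout and the per-byte layout describe the same
// memory for one pixel whose components are given in format order.
bool layouts_match(std::uint8_t c0, std::uint8_t c1,
                   std::uint8_t c2, std::uint8_t c3)
{
    const std::array<std::uint8_t, 4> as_bytes{c0, c1, c2, c3};
    return memory_bytes(pack_8_8_8_8_rev(c0, c1, c2, c3)) == as_bytes;
}
```

With format GL_BGRA the components arrive in the order B, G, R, A, so `layouts_match(b, g, r, a)` is expected to hold on a little-endian machine and fail on a big-endian one, which is the reason swapping the type alone should not change the pixel data there.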
https://github.com/renpy/renpy/issues/16 backs up that suspicion of being a performance optimisation, the one source link still working (apple) says that GL_RGBA and GL_UNSIGNED_BYTE, but doesn't say if that extends to GL_BGRA
So if someone can confirm whether this works with GL_BGRA and GL_UNSIGNED_BYTE on these AMD GPUs, then it should be possible to make that change.
So this breaks on Windows; changing the texture type to GL_RGBA (including changing the screen consumer to GL_RGBA to fix the colours) while I was testing seems to fix this. I came to ask if there is any reason why GL_BGRA is used but you have answered that.
Changing to GL_UNSIGNED_BYTE from GL_UNSIGNED_INT_8_8_8_8_REV does not solve the problem. It does seem like the AMD driver is bugged only with GL_BGRA and, from what I can tell, GL_BGR as well.
To note there isn't an opengl error thrown, the AMD driver itself throws an exception.
I am testing on Windows 11, it was the same on windows 10 when I was still running that.
To note, just changing to GL_RGBA and GL_RGB fixed the crash for me, I did not need to change GL_UNSIGNED_INT_8_8_8_8_REV
I have also tried GetTexInfo and Getntexinfo(?), the other two APIs, and they also crash.
Seems like something deep in the AMD driver is broken with BGRA on windows.
Will want to check if writing to the texture crashes as well.
Will check that and will see if there is any other way to do the copy.
Maybe the OpenCL dream may yet come
So it's only these calls that fail https://registry.khronos.org/OpenGL-Refpages/gl4/html/glGetTexImage.xhtml, the glTextureSubImage2D call was fine.
Confirming that this is apparently still an issue, although all I can see is that CasparCG quits suddenly and Windows Event Viewer shows the following. Happens no matter what the consumer is and no matter if it's a media file or simply instructing CasparCG to output a colour. At least for media I can see that ffmpeg gets to the point where it probably would start outputting image data and bang, it's quit. I can't get any more debug info out of CasparCG at this time:
Faulting application name: casparcg.exe, version: 2.3.2.0, time stamp: 0x604fb45a
Faulting module name: ntdll.dll, version: 10.0.19041.3393, time stamp: 0xfeef31d3
Exception code: 0xc0000374
My system:
- Windows 10 IoT 21H2 LTSC
- Ryzen 5 2400G with AMD Radeon(TM) RX Vega 11 Graphics
- Driver 23.19.02-230831a-396094C-AMD-Software-Adrenalin-Edition
Full version details:
- APU - AMD Radeon(TM) RX Vega 11 Graphics - Primary/Integrated
- VRAM - 2048 MB - DDR4 1467 MHz
- Driver Version - 23.19.02-230831a-396094C-AMD-Software-Adrenalin-Edition
- AMD Windows Driver Version - 31.0.21902.5
- Direct3D API Version - 12.1
- Vulkan™ API Version - 1.3.260
- OpenCL™ API Version - 2.0
- OpenGL® API Version - 4.6
- Direct3D® Driver Version - 9.14.10.01526
- Vulkan™ Driver Version - 2.0.279
- OpenCL® Driver Version - 31.0.21902.5
- OpenGL® Driver Version - 23.08.230729_569461f
- 2D Driver Version - 8.1.1.1634
- 2D Driver File Path - /REGISTRY/MACHINE/SYSTEM/CurrentControlSet/Control/Class/{4d36e968-e325-11ce-bfc1-08002be10318}/0000
- UI Version - 2023.0831.1020.1996
- AMD Audio Driver Version - 10.0.1.23
- Driver Provider - Advanced Micro Devices, Inc.
- Windows Edition - Windows 10 EnterpriseSN (64 bit)
- Windows Version - 21H2
Using GLView I can see "GL_EXT_abgr" and "GL_EXT_bgra" listed under extensions, but how that relates to anything I have no idea.