Support recording applications that use GPU via GL

Open rocallahan opened this issue 4 years ago • 7 comments

One way to do this would be to pick one or more open-source GPU drivers, study their kernel/user interface, and support that directly in rr. VMware's SVGA3D or QEMU's Virgil might be a good choice.

Another way, maybe better, would be to adapt VirtualGL (https://virtualgl.org/) to route GL calls through a pipe and record that.

rocallahan avatar Apr 15 '20 13:04 rocallahan

Fedora packages VirtualGL. Hopefully Ubuntu does too. That helps.

Trying vglrun rr record glxspheres64, rr dies on a DRM ioctl with this stack:

#0  0x00007f9260b6a359 in ioctl () at ../sysdeps/unix/syscall-template.S:78
#1  0x00007f926046e260 in drmIoctl () from /lib64/libdrm.so.2
#2  0x00007f926046e40b in drmGetVersion () from /lib64/libdrm.so.2
#3  0x00007f9260509faf in loader_get_kernel_driver_name ()
   from /lib64/libGLX_mesa.so.0
#4  0x00007f926050a6d8 in loader_get_driver_for_fd ()
   from /lib64/libGLX_mesa.so.0
#5  0x00007f92605002c1 in dri3_create_screen () from /lib64/libGLX_mesa.so.0
#6  0x00007f92604ecc29 in __glXInitialize () from /lib64/libGLX_mesa.so.0
#7  0x00007f92604e8926 in glXGetFBConfigs () from /lib64/libGLX_mesa.so.0
#8  0x00007f92604ea0c8 in glXChooseFBConfigSGIX () from /lib64/libGLX_mesa.so.0
#9  0x00007f9260751f6d in glXChooseFBConfig () from /lib64/libGLX.so.0
#10 0x00007f926105868f in _glXChooseFBConfig (nelements=0x7ffe81fd396c, 
    attrib_list=0x7ffe81fd3450, screen=0, dpy=0x564400347180)
    at /usr/src/debug/VirtualGL-2.5.2-4.fc31.x86_64/server/faker-sym.h:389
#11 glxvisual::configsFromVisAttribs (attribs=<optimized out>, 
    c_class=<optimized out>, level=<optimized out>, stereo=@0x7ffe81fd38e4: 0, 
    trans=@0x7ffe81fd38e8: 0, nElements=@0x7ffe81fd396c: 0, glx13=true)
    at /usr/src/debug/VirtualGL-2.5.2-4.fc31.x86_64/server/glxvisual.cpp:279
#12 0x00007f926102276d in glXChooseFBConfig (dpy=0x564400336980, screen=0, 
    attrib_list=0x7ffe81fd3a00, nelements=0x7ffe81fd396c)
    at /usr/src/debug/VirtualGL-2.5.2-4.fc31.x86_64/server/faker-glx.cpp:312
#13 0x00005643fed3a13e in main (argc=<optimized out>, argv=<optimized out>)

so it looks like some amount of DRM ioctl support would still be required to use vglrun this naively.
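
To make the failing call concrete: frame #2 (drmGetVersion in libdrm) issues the DRM_IOCTL_VERSION ioctl, whose argument is a struct drm_version containing userspace pointers that the kernel writes through. A recorder has to know that layout in order to capture the memory the kernel modified. The sketch below only decodes the request number and mirrors the struct. It assumes the generic Linux _IOC bit layout (as on x86-64) and the struct drm_version definition from the kernel's drm.h; the helper names ioc and decode_ioc are made up for illustration and are not rr or libdrm APIs.

```python
import ctypes

# Field widths and direction bits of the generic Linux _IOC encoding
# (the layout used on x86; a few architectures differ).
IOC_NRBITS, IOC_TYPEBITS, IOC_SIZEBITS = 8, 8, 14
IOC_NRSHIFT = 0
IOC_TYPESHIFT = IOC_NRSHIFT + IOC_NRBITS      # 8
IOC_SIZESHIFT = IOC_TYPESHIFT + IOC_TYPEBITS  # 16
IOC_DIRSHIFT = IOC_SIZESHIFT + IOC_SIZEBITS   # 30
IOC_WRITE, IOC_READ = 1, 2


class DrmVersion(ctypes.Structure):
    """Mirror of struct drm_version from the kernel's drm.h UAPI header."""
    _fields_ = [
        ("version_major", ctypes.c_int),
        ("version_minor", ctypes.c_int),
        ("version_patchlevel", ctypes.c_int),
        ("name_len", ctypes.c_size_t),
        ("name", ctypes.c_void_p),   # char *: the kernel writes through this
        ("date_len", ctypes.c_size_t),
        ("date", ctypes.c_void_p),   # char *: likewise
        ("desc_len", ctypes.c_size_t),
        ("desc", ctypes.c_void_p),   # char *: likewise
    ]


def ioc(direction, type_char, nr, size):
    """Build an ioctl request number the way the kernel's _IOC macro does."""
    return ((direction << IOC_DIRSHIFT) | (size << IOC_SIZESHIFT)
            | (ord(type_char) << IOC_TYPESHIFT) | (nr << IOC_NRSHIFT))


def decode_ioc(request):
    """Split an ioctl request number back into its four fields."""
    return {
        "dir": (request >> IOC_DIRSHIFT) & 0x3,
        "size": (request >> IOC_SIZESHIFT) & ((1 << IOC_SIZEBITS) - 1),
        "type": chr((request >> IOC_TYPESHIFT) & 0xFF),
        "nr": (request >> IOC_NRSHIFT) & 0xFF,
    }


# DRM_IOWR(0x00, struct drm_version): DRM ioctls use type 'd'.
DRM_IOCTL_VERSION = ioc(IOC_READ | IOC_WRITE, "d", 0x00,
                        ctypes.sizeof(DrmVersion))

if __name__ == "__main__":
    # On x86-64 Linux the struct is 64 bytes, giving 0xc0406400.
    print(hex(DRM_IOCTL_VERSION), decode_ioc(DRM_IOCTL_VERSION))
```

The size field alone is not enough for a recorder: the name, date and desc buffers live outside the 64-byte struct, so each such ioctl needs its own handling. That is why supporting a GPU driver directly means studying its ioctl interface one request at a time.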

rocallahan avatar Apr 15 '20 13:04 rocallahan

(But maybe I'm doing it wrong.)

rocallahan avatar Apr 15 '20 13:04 rocallahan

I misunderstood what VirtualGL does. An application using VirtualGL is still intended to use direct rendering. VirtualGL captures the resulting images and forwards them to another display. It is not intended to be a faster/more complete GLX, which is what we want. It does intercept GL calls so it might be usable as the basis for implementing a GL forwarding solution.

rocallahan avatar Apr 16 '20 05:04 rocallahan

This is possibly helpful: https://github.com/jrmuizel/rr-dataflow

It shows how Mesa's softpipe renderer can be used to track pixel changes.

neon12345 avatar May 17 '20 05:05 neon12345

I stumbled upon this ticket while considering a similar feature, but here I'd like to propose a different implementation.

While rr currently treats graphics as a kind of side effect, for debugging it is better to treat it as part of the program state, so that we can see what has been rendered up to the point the replay has reached. To achieve that, I propose integrating with a graphics API call recording tool such as RenderDoc or apitrace. These tools work by hooking the appropriate graphics APIs (OpenGL, Vulkan, etc., rather than DRM), so what we need in rr is some way to record at graphics API level (rather than driver communication level) and disable syscall recording inside Mesa userland components. Does this sound feasible?

(The proposal above only concerns recording so far; for replay we would need additional work in RenderDoc etc. to restore the GPU state up to a snapshot.)

As an optional addition, we can use some IPC-based graphics implementation to isolate potential undefined behavior, which is basically the virtualization idea initially described in this issue.

ishitatsuyuki avatar Sep 28 '20 07:09 ishitatsuyuki

> what we need in rr is some way to record at graphics API level (rather than driver communication level) and disable syscall recording inside Mesa userland components. Does this sound feasible?

Unfortunately not. A fundamental invariant of rr is that during replay we reproduce the userspace register and memory state of the recorded processes. It doesn't seem feasible to separate out the state of a particular library in an address space and say "this will be different during replay". For example, the library will very likely share a memory allocator with other libraries we do want to record, so diverging behaviour of the library will cause memory addresses in other libraries to diverge, which breaks rr.
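
The shared-allocator problem can be illustrated with a toy model. This is only a sketch under simplifying assumptions: BumpAllocator and run are hypothetical names, and a real malloc is far more complex, but the effect on addresses is the same in kind.

```python
class BumpAllocator:
    """Toy shared heap: each allocation takes the next free address."""

    def __init__(self, base=0x1000):
        self.next_free = base

    def malloc(self, size):
        addr = self.next_free
        self.next_free += (size + 15) & ~15   # round up to 16 bytes
        return addr


def run(gl_scratch_sizes):
    """Simulate one execution and return the address the application gets.

    gl_scratch_sizes models allocations made inside the graphics library,
    which could differ between runs if its syscalls were not recorded.
    """
    heap = BumpAllocator()
    for size in gl_scratch_sizes:
        heap.malloc(size)          # the graphics library allocates first
    return heap.malloc(64)         # unrelated application allocation


recorded = run([256, 32])   # library behaviour during recording
replayed = run([256])       # library took a different path during replay

if __name__ == "__main__":
    print(hex(recorded), hex(replayed))   # prints 0x1120 0x1100
```

The application code is identical in both runs, yet its pointer differs. Any recorded data that depends on that pointer (register contents, stored pointers, syscall arguments) no longer matches, which is the divergence described above.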

So, we could pick one or more open-source graphics drivers, like SVGA3D or Virgil, and support those directly, provided they don't share memory between userspace and the driver in ways that are too difficult to handle. Alternatively, we could do a GLX-like thing that interposes on GL (or Vulkan, if we can make that work) and forwards everything to a process outside the recording.
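
A minimal sketch of the forwarding idea, assuming a made-up wire format: the opcodes, the GLProxy class and the serve function are all hypothetical, and no existing protocol (GLX, Virgil, VirtualGL) is being reproduced. The recorded process sees only write() calls on a pipe, and the process on the other end is the only one that would talk to the GPU driver.

```python
import os
import struct

# Hypothetical opcodes for an invented wire format, for illustration only.
OP_CLEAR_COLOR = 1   # payload: four float32 values (r, g, b, a)
OP_CLEAR = 2         # payload: one uint32 bitmask

HEADER = struct.Struct("<HH")   # opcode, payload length in bytes


def encode(opcode, payload):
    """Frame one call as header + payload."""
    return HEADER.pack(opcode, len(payload)) + payload


class GLProxy:
    """Stands in for libGL in the recorded process: no ioctls, only write()."""

    def __init__(self, fd):
        self.fd = fd

    def glClearColor(self, r, g, b, a):
        os.write(self.fd, encode(OP_CLEAR_COLOR,
                                 struct.pack("<4f", r, g, b, a)))

    def glClear(self, mask):
        os.write(self.fd, encode(OP_CLEAR, struct.pack("<I", mask)))


def read_exact(fd, n):
    """Read exactly n bytes, looping over short reads."""
    buf = b""
    while len(buf) < n:
        chunk = os.read(fd, n - len(buf))
        if not chunk:
            raise EOFError("pipe closed mid-message")
        buf += chunk
    return buf


def serve(fd):
    """Decode calls until EOF; a real server would invoke the GL driver."""
    calls = []
    while True:
        first = os.read(fd, 1)
        if not first:               # writer closed the pipe
            return calls
        opcode, length = HEADER.unpack(first + read_exact(fd, HEADER.size - 1))
        payload = read_exact(fd, length)
        if opcode == OP_CLEAR_COLOR:
            calls.append(("glClearColor",) + struct.unpack("<4f", payload))
        elif opcode == OP_CLEAR:
            calls.append(("glClear",) + struct.unpack("<I", payload))
        else:
            raise ValueError(f"unknown opcode {opcode}")


def demo():
    """Both ends run in one process here only to keep the sketch short."""
    read_fd, write_fd = os.pipe()
    gl = GLProxy(write_fd)
    gl.glClearColor(0.0, 0.5, 1.0, 1.0)
    gl.glClear(0x4000)              # GL_COLOR_BUFFER_BIT
    os.close(write_fd)
    calls = serve(read_fd)
    os.close(read_fd)
    return calls


if __name__ == "__main__":
    print(demo())
```

In a real setup the server would run as a separate, unrecorded process. Calls that return data would need a reply channel, and the replies would reach the recorded process through ordinary read() calls. Bulk data such as textures and vertex buffers is where a design like this gets expensive, since it cannot simply be shared memory.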

rocallahan avatar Sep 29 '20 09:09 rocallahan

For Vulkan support, SwiftShader should work fine in CPU mode.

Manouchehri avatar Jul 27 '22 19:07 Manouchehri