Remote Rendering Device

Open nahiim opened this issue 3 years ago • 0 comments

THE CHALLENGE

The engine currently has three backend APIs (Vulkan, OpenGL and GLES), but the debugger (RenderDoc) can't cope when we run one instance of each backend in parallel within the same process.

THE SOLUTION

We want to abstract/separate the GPU API code into separate processes.

The goal is to have separate processes for the different backends, so that we can test-run one instance of each backend simultaneously. We do this by providing a "wrapper" for all IGPU interface objects, so that the actual API calls can run in another process or even on another machine.

The idea is to have a "special" video::CRemoteServer which would use a "communication provider" (dependency injection, basically an interface for sending/receiving command structs). We'd have an extra sample, let's say RemoteRenderServer, which would spin up the CRemoteServer(smart_refctd_ptr<IAPIConnection>&&, smart_refctd_ptr<ICommunicationProvider>&&) from a real GL/GLES/VK API connection, open a TCP socket or allocate some Shared Memory, and start waiting for a connection (that would be the responsibility of the ICommunicationProvider). It would print the TCP socket + IP or the Shared Memory address to stdout and/or a text file. Then, when we run another example and open a CRemoteConnection(smart_refctd_ptr<ICommunicationProvider>&&), we'd give it another communication provider, this time in "client mode", i.e. created from the TCP socket it's supposed to connect to or the Shared Memory it's supposed to import.
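To make the "communication provider" idea concrete, here is a minimal sketch. The issue only names ICommunicationProvider; the `send`/`receive` methods, the `Packet` alias, `LoopbackChannel` and `CLoopbackProvider` are all assumptions made up for illustration, and the loopback channel is an in-process stand-in for a real TCP socket or Shared Memory region.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical message type: an opaque blob of serialized command bytes.
using Packet = std::vector<std::uint8_t>;

// Hypothetical interface; the issue names the class but not its methods.
class ICommunicationProvider
{
public:
    virtual ~ICommunicationProvider() = default;
    virtual void send(const Packet& packet) = 0;
    // Returns false when no packet is pending.
    virtual bool receive(Packet& out) = 0;
};

// In-process stand-in for a socket or shared-memory ring:
// one queue per direction, shared by both endpoints.
struct LoopbackChannel
{
    std::deque<Packet> toServer;
    std::deque<Packet> toClient;
};

class CLoopbackProvider final : public ICommunicationProvider
{
public:
    enum class Mode { Server, Client };

    CLoopbackProvider(std::shared_ptr<LoopbackChannel> channel, Mode mode)
        : m_channel(std::move(channel)), m_mode(mode)
    {
        assert(m_channel != nullptr);
    }

    void send(const Packet& packet) override { outbox().push_back(packet); }

    bool receive(Packet& out) override
    {
        auto& in = inbox();
        if (in.empty())
            return false;
        out = std::move(in.front());
        in.pop_front();
        return true;
    }

private:
    // The server writes to the client's queue and reads its own, and vice versa.
    std::deque<Packet>& outbox() { return m_mode == Mode::Server ? m_channel->toClient : m_channel->toServer; }
    std::deque<Packet>& inbox() { return m_mode == Mode::Server ? m_channel->toServer : m_channel->toClient; }

    std::shared_ptr<LoopbackChannel> m_channel;
    Mode m_mode;
};
```

In this sketch the same class serves both ends and only the mode differs, which mirrors the "server mode" / "client mode" split described above. A TCP or Shared Memory provider would implement the same two methods.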

CRemoteServer would deserialize each call and forward it to the proper PHYSICAL API, while CRemoteLogicalDevice would serialize calls and send them via the CRemoteConnection's communication provider.
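The serialize/forward split could look roughly like the sketch below. Everything here is hypothetical: the `ECommand` ids, the `SCreateBufferCmd` payload, the 8-byte header layout and `CFakeBackend` (a stand-in for the real API connection) are assumptions, not part of the engine. Copying raw structs also assumes both ends share the same ABI and endianness, which holds for two processes on one machine but not necessarily across machines.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using Packet = std::vector<std::uint8_t>;

// Hypothetical command ids and payload.
enum class ECommand : std::uint32_t { CreateBuffer = 1, FlushMemory = 2 };

struct SCreateBufferCmd
{
    std::uint64_t size;
    std::uint32_t usageFlags;
};

// Assumed wire layout: [u32 command id][u32 payload size][payload bytes].
constexpr std::size_t kHeaderSize = 2 * sizeof(std::uint32_t);

// Client side: what a CRemoteLogicalDevice-like wrapper would do before sending.
Packet serialize(ECommand id, const void* payload, std::uint32_t payloadSize)
{
    assert(payload != nullptr || payloadSize == 0);
    Packet p(kHeaderSize + payloadSize);
    const auto rawId = static_cast<std::uint32_t>(id);
    std::memcpy(p.data(), &rawId, sizeof(rawId));
    std::memcpy(p.data() + sizeof(rawId), &payloadSize, sizeof(payloadSize));
    if (payloadSize != 0)
        std::memcpy(p.data() + kHeaderSize, payload, payloadSize);
    return p;
}

// Stand-in for the real GL/GLES/VK backend living in the server process.
struct CFakeBackend
{
    std::uint64_t lastBufferSize = 0;
    std::uint32_t buffersCreated = 0;

    void createBuffer(const SCreateBufferCmd& cmd)
    {
        lastBufferSize = cmd.size;
        ++buffersCreated;
    }
};

// Server side: validate the packet, decode it, forward it to the backend.
// Returns false for malformed or unsupported packets.
bool dispatch(const Packet& p, CFakeBackend& backend)
{
    if (p.size() < kHeaderSize)
        return false;
    std::uint32_t rawId = 0;
    std::uint32_t payloadSize = 0;
    std::memcpy(&rawId, p.data(), sizeof(rawId));
    std::memcpy(&payloadSize, p.data() + sizeof(rawId), sizeof(payloadSize));
    if (p.size() - kHeaderSize != payloadSize)
        return false;

    switch (static_cast<ECommand>(rawId))
    {
        case ECommand::CreateBuffer:
        {
            if (payloadSize != sizeof(SCreateBufferCmd))
                return false;
            SCreateBufferCmd cmd{};
            std::memcpy(&cmd, p.data() + kHeaderSize, sizeof(cmd));
            backend.createBuffer(cmd);
            return true;
        }
        default:
            return false; // not handled in this sketch
    }
}
```

The size checks matter more than usual here, because the server cannot trust that the bytes it receives were produced by a well-behaved client.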

Basically, there are a few things we need to do:

  • For every Physical Device, report that the device type is VIRTUAL_GPU
  • For every COHERENT memory heap/type, remove the COHERENT flag, so that memory always needs to be flushed[1]
  • Slight backflips with the IDeviceMemoryAllocation implementation to handle buffer mapping
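The first two items in the list above amount to rewriting what the remote physical device reports. A rough sketch, assuming simplified stand-in types: `EDeviceType`, the `MEMORY_*` flag constants, `SMemoryType`, `SPhysicalDeviceInfo` and both functions are hypothetical and do not match the engine's real structs.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical, simplified mirrors of the properties a physical device reports.
enum class EDeviceType : std::uint8_t { DiscreteGPU, IntegratedGPU, VirtualGPU };

constexpr std::uint32_t MEMORY_DEVICE_LOCAL  = 1u << 0;
constexpr std::uint32_t MEMORY_HOST_VISIBLE  = 1u << 1;
constexpr std::uint32_t MEMORY_HOST_COHERENT = 1u << 2;

struct SMemoryType
{
    std::uint32_t flags;
    std::uint32_t heapIndex;
};

struct SPhysicalDeviceInfo
{
    EDeviceType type;
    std::vector<SMemoryType> memoryTypes;
};

// Takes what the real backend reports and returns what the remote
// physical device would report to the client application.
SPhysicalDeviceInfo makeRemoteView(SPhysicalDeviceInfo real)
{
    real.type = EDeviceType::VirtualGPU;
    for (auto& memoryType : real.memoryTypes)
    {
        // Writes can't become visible to the server on their own,
        // so no memory type may claim to be coherent.
        memoryType.flags &= ~MEMORY_HOST_COHERENT;
        assert((memoryType.flags & MEMORY_HOST_COHERENT) == 0u);
    }
    return real;
}

// Mappable memory without the coherent flag needs explicit flush/invalidate calls.
bool requiresExplicitFlush(const SMemoryType& memoryType)
{
    return (memoryType.flags & MEMORY_HOST_VISIBLE) != 0u
        && (memoryType.flags & MEMORY_HOST_COHERENT) == 0u;
}
```

With the coherent flag stripped, a correctly written client already has to call flush and invalidate, which gives the remote layer the hook it needs to move the data.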

The Remote Connection needs to be initialized before physical device creation, because some features should be queried from the physical API.

However, if we just use Shared Memory (same PC, different process), the CRemoteServer can allocate some Shared Interprocess Memory. A flush/invalidate is then a memcpy between the Shared Memory and the API's DeviceMemoryAllocation's mapped pointer (which still needs to be done server side), so there would be less data to serialize.

IN SUMMARY

We want to be able to render via multiple APIs (Vulkan, GL, GLES) at the same time. RenderDoc won't know which API to capture when three APIs run in the same process. Therefore we run multiple processes, one per API, so that all the APIs can run simultaneously. Commands travel from client to server, where the API calls are made, either via Inter-Process Communication between processes on the same machine or via a TCP connection in a distributed system.

[1] Flush = upload over TCP; Invalidate = download over TCP

nahiim, Jun 10 '22 15:06