qemu-3dfx

Pył dx12

Open roberttryton opened this issue 1 year ago • 3 comments

Pył in dosbox with dgvoodoo2 dx11: 15 fps
Pył in dosbox with dgvoodoo2 dx12: 15 fps
Pył in qemu with dgvoodoo2 dx11: 70 fps
Pył in qemu with dgvoodoo2 dx12: 9 fps

roberttryton avatar Aug 25 '22 08:08 roberttryton

Please provide the save game for FPS measurement. I am getting similar FPS between dgVoodoo2 2.64 and 2.79.2. Below is the screenshot with dgVoodoo2 2.79.2 on Windows 11 WHPX Intel Iris Xe Core i7-1165G7.

image

I assume 2.64 uses dx11 and 2.79.2 uses dx12 when dgvoodoo.conf is set to the best available API.

kjliew avatar Aug 25 '22 15:08 kjliew

For now, dx12 has to be selected explicitly in dgvoodoo2's options. "best" does not pick dx12 yet, I guess for this very reason.

(Measured on the first interactive screen of the game, after the intro, without moving the mouse/player.)

roberttryton avatar Aug 25 '22 16:08 roberttryton

OK, I have reproduced it, but I don't believe it has anything to do with the QEMU implementation. The likely cause is that the Glide LFB in dgvoodoo2 d3d12 is slow for the CPU bulk read used for the LFB fill in Shared Memory. It may not even be dgvoodoo2's fault if this is controlled by the GPU driver. In QEMU TCG, LfbHandler,1 is faster with d3d12 and similar to the dosbox results.

For the equivalent of LfbHandler,1 with QEMU acceleration, you can try LfbMapBufo,1. Only KVM and WHPX are supported because it uses the hypervisor's IOMMU/SLAT to map a region of memory directly from Host into Guest, i.e. Zero-Copy. IOMMU/SLAT direct mapping of GPU memory has many unknown issues; perhaps it is not widely used or fully tested yet. I have found that Guest CPU access to SLAT-mapped memory is ridiculously slow. Other potential caveats: mapping GPU memory across PCIe, or an iGPU/APU carved-out memory region, may fail or force the hypervisor to emulate such accesses.

LfbMapBufo,1 restores the performance of dgvoodoo2 d3d12, but it is still much slower than d3d11. If you try LfbMapBufo,1 with dgvoodoo2 d3d11, you will find that, despite being Zero-Copy Shared Memory, it is slightly slower than the One-Copy Shared Memory LFB fill with CPU bulk read.
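To make the One-Copy versus Zero-Copy distinction concrete, here is a minimal sketch in plain Python. It is only an analogy: the buffer size and the names `host_lfb`, `one_copy_read` and `zero_copy_map` are hypothetical, and nothing here touches qemu-3dfx, dgvoodoo2, a GPU or a hypervisor. It shows that a bulk read produces a separate snapshot, while a mapping exposes the same memory with no copy at all.

```python
# Hypothetical stand-in for a host-side LFB: 640x480 at 16 bits per pixel.
# These dimensions are an assumption chosen for illustration only.
WIDTH, HEIGHT, BYTES_PER_PIXEL = 640, 480, 2
host_lfb = bytearray(WIDTH * HEIGHT * BYTES_PER_PIXEL)


def one_copy_read(src: bytearray) -> bytes:
    """One-Copy: bulk-copy the whole buffer into a separate snapshot."""
    return bytes(src)


def zero_copy_map(src: bytearray) -> memoryview:
    """Zero-Copy: expose the same underlying memory without copying it."""
    return memoryview(src)


snapshot = one_copy_read(host_lfb)
mapped = zero_copy_map(host_lfb)

# A later write on the "host" side is visible through the mapping,
# but not in the snapshot that was copied earlier.
host_lfb[0] = 0xFF
print(snapshot[0], mapped[0])  # prints "0 255"
```

Avoiding the copy does not by itself make the path faster. If every access through the mapping is expensive, as observed above for SLAT-mapped memory, a single bulk copy into ordinary memory can still come out ahead.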

If you get the chance to talk to the dgvoodoo2 author, you could ask whether there is a way to control the attributes of GPU memory in d3d12. Perhaps he could also review the differences between the d3d11 and d3d12 implementations of Glide LFB. QEMU Glide pass-through is really simple and has no notion of the specific backends of Glide wrappers (d3d9, d3d11, d3d12, opengl, vulkan, etc.).

kjliew avatar Aug 25 '22 18:08 kjliew

It is interesting that if you have these three options together:

LfbMapBufo,1
[Glide] Force Emulating True PCI Access enabled
whpx

then Pył starts, resizes the window, and you hear sound, but the screen is black; the mouse works but is not visible.

roberttryton avatar Sep 09 '22 18:09 roberttryton

Yes, you're right about the black screen. LfbMapBufo,1 requires a supported hypervisor. If you set it without one, it is just dropped silently. LfbMapBufo is a new config that the dgVoodoo2 author does not anticipate, because a VOGONS moderator banned me from participating. Only if my ban is lifted, or Dege engages directly here on GitHub, can we see whether this new config would improve dgVoodoo2 together with QEMU.

Shame on the VOGONS moderator for despising the existence of QEMU for game preservation.

kjliew avatar Sep 09 '22 18:09 kjliew