
GFX10 and GFX10.3 (Navi, RDNA) support in ACO

Open Venemo opened this issue 6 years ago • 109 comments

This issue is for tracking ACO's progress on Navi.

What works, what doesn't

All shader stages should work. Every Vulkan game should work.

If you find issues, please file a bug in the upstream Mesa bug tracker.

Tested hardware

  • [x] Navi 10: Radeon RX 5700
  • [x] Navi 10: Radeon RX 5700 XT
  • [x] Navi 10: Radeon RX 5600 XT tested and benchmarked by Phoronix
  • [x] Navi 14: Radeon RX 5500 XT tested and benchmarked by Phoronix
  • [ ] Navi 12 should work, but not tested
  • [x] Navi 2x: RX 6800 and 6800 XT [tested and benchmarked by Phoronix](https://www.phoronix.com/scan.php?page=article&item=rx6800-more-performance&num=1)

Not tested with unreleased Navi cards as we don't have those. If you test with hardware that is not on the list yet, please let us know.

How to test

We suggest using the latest stable Mesa, where ACO is the default compiler of the RADV Vulkan driver.

ACO has been in Mesa since version 19.3, but older Mesa releases required the RADV_PERFTEST=aco environment variable to enable it.

New hardware features support in Navi 1x

  • [x] Wave32 (support for 32 lanes rather than 64)
  • [x] NGG (Next Generation Geometry)

New hardware features support in Navi 2x

  • [ ] Hardware accelerated ray tracing
  • [ ] Mesh shaders
  • [ ] Variable rate shading

Possible optimizations

  • [ ] use round-robin register allocation to avoid WAR hazards (and help any post-RA scheduling)

  • [ ] schedule ALU instructions (after RA for easier/faster scheduling?)

  • [ ] choose registers to avoid bank conflicts (either as a reassignment pass or during RA)
    See GCNRegBankReassign.cpp in LLVM

  • [ ] NGG shader based primitive culling
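
To illustrate the first item in the list above: a round-robin allocator keeps a moving cursor instead of always handing out the lowest free register, so a register that was just freed is not immediately overwritten by the next definition. The following is a toy sketch only. `RoundRobinAllocator` and its methods are hypothetical names made up for illustration and do not reflect ACO's actual register allocator.

```python
class RoundRobinAllocator:
    """Toy register allocator (hypothetical, not ACO code)."""

    def __init__(self, num_regs):
        self.num_regs = num_regs
        self.in_use = set()
        self.cursor = 0  # next register index to try first

    def alloc(self):
        # Scan from the cursor and wrap around, instead of starting at 0.
        for offset in range(self.num_regs):
            reg = (self.cursor + offset) % self.num_regs
            if reg not in self.in_use:
                self.in_use.add(reg)
                self.cursor = (reg + 1) % self.num_regs
                return reg
        raise RuntimeError("out of registers")

    def free(self, reg):
        self.in_use.discard(reg)
```

With a lowest-free-first policy, freeing a register and allocating again would return the same register right away. Here the second allocation moves on to the next index, which leaves more distance between the last read of the old value and the next write.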

Venemo avatar Sep 17 '19 07:09 Venemo

So, that's what the ACO developers have been doing for the past month.

SR-dude avatar Sep 17 '19 17:09 SR-dude

Just tested aco-navi branch with The Witcher 3 in Wine-esync+dxvk (Sapphire Pulse RX 5700 XT). It causes a GPU hang with this in dmesg (computer is accessible remotely through ssh):

[   52.097894] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
[   57.207014] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=2248, emitted seq=2251
[   57.207080] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process witcher3.exe pid 2612 thread witcher3.exe pid 2687
[   57.207083] [drm] GPU recovery disabled.
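
For anyone collecting hang reports, the lines above can be condensed with a small script. This is a minimal sketch that assumes the message format is exactly as quoted in this comment; other kernel versions may word it differently, and `summarize_hang` is a hypothetical helper name.

```python
import re

# Patterns derived only from the dmesg lines quoted in this thread.
RING_TIMEOUT = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
PROCESS_INFO = re.compile(
    r"Process information: process (?P<process>\S+) pid (?P<pid>\d+)"
)


def summarize_hang(dmesg_text):
    """Collect ring, sequence numbers and process from amdgpu timeout lines."""
    summary = {}
    for line in dmesg_text.splitlines():
        m = RING_TIMEOUT.search(line)
        if m:
            summary["ring"] = m.group("ring")
            summary["signaled"] = int(m.group("signaled"))
            summary["emitted"] = int(m.group("emitted"))
        m = PROCESS_INFO.search(line)
        if m:
            summary["process"] = m.group("process")
            summary["pid"] = int(m.group("pid"))
    return summary
```

Feeding it the saved output of `dmesg` gives a dictionary that is easier to paste into a bug report than the raw log.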

shmerl avatar Sep 18 '19 00:09 shmerl

@shmerl Basically, most GPU hangs look like that in dmesg, so that message doesn't bring us closer to finding the problem. Can you try to identify which kind of shader causes the hang? As a first step, can you try disabling CS in ACO? In radv_pipeline.c you can edit radv_aco_supported_stage; just comment out the CS support there.
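
The narrowing-down approach described here (turn off one shader stage in ACO at a time and see whether the hang goes away) can be sketched as a loop. This is only an illustration of the procedure: `find_hanging_stages` and the `hangs_with` callback are hypothetical names, the stage names are placeholders, and in practice each check is a manual step (edit radv_aco_supported_stage, rebuild Mesa, run the game).

```python
def find_hanging_stages(stages, hangs_with):
    """Return the stages whose removal from ACO makes the hang go away.

    stages     -- list of stage names currently compiled by ACO
    hangs_with -- callable taking the list of stages still enabled in ACO
                  and returning True if the hang still reproduces
    """
    suspects = []
    for stage in stages:
        # Disable exactly one stage and keep all others enabled.
        enabled = [s for s in stages if s != stage]
        if not hangs_with(enabled):
            suspects.append(stage)
    return suspects
```

If no single stage is reported, the hang either involves more than one stage or is not caused by shader compilation at all.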

Venemo avatar Sep 18 '19 07:09 Venemo

Also please compile mesa in debug mode, just to see if it hits any assertions and such.

Venemo avatar Sep 18 '19 08:09 Venemo

Some fixes since yesterday:

  • Fixed SLC, GLC, DLC bits in FLAT, Scratch and Global instructions
  • Set DLC bit only for loads, don't set it according to GLC
  • Added some RDNA ISA comments to ACO README
  • Enabled SMEM SOE optimization on Navi (apparently it is supported even though there is no SOE bit)

Venemo avatar Sep 18 '19 10:09 Venemo

Compiled with debug and captured some output. Without disabling CS:

info:  Game: witcher3.exe
info:  DXVK: v1.3.4
warn:  OpenVR: Failed to locate module
info:  Enabled instance extensions:
info:    VK_KHR_get_physical_device_properties2
info:    VK_KHR_surface
info:    VK_KHR_win32_surface
WARNING: Experimental compiler backend enabled. Here be dragons! Incorrect rendering, GPU hangs and/or resets are likely
WARNING: radv is not a conformant vulkan implementation, testing use only.
info:  AMD RADV/ACO NAVI10 (LLVM 10.0.0):
info:    Driver: 19.2.99
info:    Vulkan: 1.1.107
info:    Memory Heap[0]: 
info:      Size: 7920 MiB
info:      Flags: 0x1
info:      Memory Type[0]: Property Flags = 0x1
info:    Memory Heap[1]: 
info:      Size: 256 MiB
info:      Flags: 0x1
info:      Memory Type[2]: Property Flags = 0x7
info:    Memory Heap[2]: 
info:      Size: 8176 MiB
info:      Flags: 0x0
info:      Memory Type[1]: Property Flags = 0x6
info:      Memory Type[3]: Property Flags = 0xe
info:  D3D11CoreCreateDevice: Probing D3D_FEATURE_LEVEL_11_0
info:  D3D11CoreCreateDevice: Using feature level D3D_FEATURE_LEVEL_11_0
info:  Device properties:
info:    Device name:     : AMD RADV/ACO NAVI10 (LLVM 10.0.0)
info:    Driver version   : 19.2.99
info:  Enabled device extensions:
info:    VK_EXT_conditional_rendering
info:    VK_EXT_depth_clip_enable
info:    VK_EXT_host_query_reset
info:    VK_EXT_memory_priority
info:    VK_EXT_shader_demote_to_helper_invocation
info:    VK_EXT_shader_stencil_export
info:    VK_EXT_shader_viewport_index_layer
info:    VK_EXT_transform_feedback
info:    VK_EXT_vertex_attribute_divisor
info:    VK_KHR_create_renderpass2
info:    VK_KHR_dedicated_allocation
info:    VK_KHR_depth_stencil_resolve
info:    VK_KHR_descriptor_update_template
info:    VK_KHR_draw_indirect_count
info:    VK_KHR_driver_properties
info:    VK_KHR_get_memory_requirements2
info:    VK_KHR_image_format_list
info:    VK_KHR_maintenance1
info:    VK_KHR_maintenance2
info:    VK_KHR_sampler_mirror_clamp_to_edge
info:    VK_KHR_shader_draw_parameters
info:    VK_KHR_swapchain
info:  Device features:
info:    robustBufferAccess                     : 1
info:    fullDrawIndexUint32                    : 1
info:    imageCubeArray                         : 1
info:    independentBlend                       : 1
info:    geometryShader                         : 1
info:    tessellationShader                     : 1
info:    sampleRateShading                      : 1
info:    dualSrcBlend                           : 1
info:    logicOp                                : 1
info:    multiDrawIndirect                      : 1
info:    drawIndirectFirstInstance              : 1
info:    depthClamp                             : 1
info:    depthBiasClamp                         : 1
info:    fillModeNonSolid                       : 1
info:    depthBounds                            : 1
info:    multiViewport                          : 1
info:    samplerAnisotropy                      : 1
info:    textureCompressionBC                   : 1
info:    occlusionQueryPrecise                  : 1
info:    pipelineStatisticsQuery                : 1
info:    vertexPipelineStoresAndAtomics         : 0
info:    fragmentStoresAndAtomics               : 1
info:    shaderImageGatherExtended              : 1
info:    shaderStorageImageExtendedFormats      : 1
info:    shaderStorageImageReadWithoutFormat    : 0
info:    shaderStorageImageWriteWithoutFormat   : 1
info:    shaderClipDistance                     : 1
info:    shaderCullDistance                     : 1
info:    shaderFloat64                          : 1
info:    shaderInt64                            : 1
info:    variableMultisampleRate                : 1
info:  VK_EXT_conditional_rendering
info:    conditionalRendering                   : 1
info:  VK_EXT_depth_clip_enable
info:    depthClipEnable                        : 1
info:  VK_EXT_host_query_reset
info:    hostQueryReset                         : 1
info:  VK_EXT_memory_priority
info:    memoryPriority                         : 1
info:  VK_EXT_shader_demote_to_helper_invocation
info:    shaderDemoteToHelperInvocation         : 1
info:  VK_EXT_transform_feedback
info:    transformFeedback                      : 1
info:    geometryStreams                        : 1
info:  VK_EXT_vertex_attribute_divisor
info:    vertexAttributeInstanceRateDivisor     : 1
info:    vertexAttributeInstanceRateZeroDivisor : 1
info:  Queue families:
info:    Graphics : 0
info:    Transfer : 0
info:  DXVK: Read 471 valid state cache entries
info:  DXVK: Using 16 compiler threads
warn:  DXGI: VK_FORMAT_D24_UNORM_S8_UINT -> VK_FORMAT_D32_SFLOAT_S8_UINT

Hangs after that.

shmerl avatar Sep 19 '19 01:09 shmerl

When disabling CS, it hangs too fast, and capturing the output just produces an empty log for me.

Let me know if you need the game to test it; maybe the developers can provide a key.

shmerl avatar Sep 19 '19 01:09 shmerl

All DXVK/D9VK games that I've tested cause a GPU hang on launch for me, so it might not be just a Witcher 3 issue.

Here's the list: GTA4, GTA5, The Witcher 3, Mirror's Edge.

And, assuming I'm commenting out the correct line (stage == MESA_SHADER_COMPUTE), disabling CS has no effect.

I'm also using a custom card (MSI Evoke 5700xt) so maybe that's related.

aqxa1 avatar Sep 19 '19 07:09 aqxa1

Thanks guys for your testing. I haven't tested any DXVK games yet so it's entirely possible that their shaders do something that aco-navi isn't prepared for. I'm currently working on implementing subgroup shuffles, but I promise I'll look into what is going on with DXVK.

Venemo avatar Sep 19 '19 08:09 Venemo

@Venemo Actually, even vkcube causes a GPU hang for me.

With the following error:

amdgpu: radv_amdgpu_cs_query_fence_status failed.
amdgpu: The CS has been rejected, see dmesg for more information.
vk: error: failed to submit CS 0

Does that mean I failed to disable ACO's CS, or that Mesa's CS is hanging as well? It doesn't hang with normal LLVM Mesa, for the record.

aqxa1 avatar Sep 19 '19 08:09 aqxa1

@aqxa1 @shmerl Are you guys sure that you disabled NGG during your testing? I think I mentioned in the original post that NGG is not implemented in ACO yet, but just to clarify I now edited the post and added a few env vars to "How to test".

This works for me without hanging:

RADV_DEBUG=nongg,nocache RADV_PERFTEST=aco vkcube
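
For scripted test runs, the same environment can be built programmatically. This is a minimal sketch that uses only the variables and values shown in the command above; `radv_env` is a hypothetical helper name, and the commented-out launch assumes that vkcube is installed.

```python
import os


def radv_env(perftest=("aco",), debug=("nongg", "nocache"), base=None):
    """Return a copy of the environment with RADV_PERFTEST/RADV_DEBUG set."""
    env = dict(os.environ if base is None else base)
    if perftest:
        env["RADV_PERFTEST"] = ",".join(perftest)
    if debug:
        env["RADV_DEBUG"] = ",".join(debug)
    return env


# Example launch (needs vkcube and a RADV-capable system):
# import subprocess
# subprocess.run(["vkcube"], env=radv_env(), check=True)
```

Building a copy rather than modifying `os.environ` keeps the options scoped to the one process being tested.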

Venemo avatar Sep 19 '19 08:09 Venemo

@Venemo That fixes the issue, thanks. I had assumed it just wasn't used rather than it needing to be explicitly disabled.

aqxa1 avatar Sep 19 '19 09:09 aqxa1

Just ran some quick tests. Other than some random GPU hangs, I also get the following error with TW3 and GTA5: ../src/amd/vulkan/radv_descriptor_set.c:496: VK_ERROR_OUT_OF_POOL_MEMORY

It's accompanied with missing/flickering models with GTA5 at least (I didn't test TW3 extensively).

aqxa1 avatar Sep 19 '19 10:09 aqxa1

The VK_ERROR_OUT_OF_POOL_MEMORY error happens sometimes for me, I don't think it's an actual issue or related to ACO

pendingchaos avatar Sep 19 '19 10:09 pendingchaos

> The VK_ERROR_OUT_OF_POOL_MEMORY error happens sometimes for me, I don't think it's an actual issue or related to ACO

I should add it doesn't occur with the same games under regular Mesa LLVM, and I get hundreds of them almost immediately when getting in game.

aqxa1 avatar Sep 19 '19 11:09 aqxa1

Were you using a debug build? I think the error still happens with release builds but is only actually printed with debug builds.

If I'm remembering how DXVK handles descriptor pools correctly, this error is expected and DXVK handles it fine.

EDIT: I might not be remembering correctly.

EDIT2: I'm remembering correctly. DXVK allocates a new descriptor pool when the current one is out of memory.
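
The pattern described here (treat an out-of-pool-memory result as a signal to create a fresh pool and retry) can be sketched without any Vulkan calls. This is a toy model and not DXVK's actual code: `OutOfPoolMemory` stands in for VK_ERROR_OUT_OF_POOL_MEMORY, and `DescriptorPool` and `GrowingAllocator` are hypothetical classes.

```python
class OutOfPoolMemory(Exception):
    """Stands in for VK_ERROR_OUT_OF_POOL_MEMORY in this toy model."""


class DescriptorPool:
    """Toy pool that can hand out a fixed number of descriptor sets."""

    def __init__(self, max_sets):
        self.remaining = max_sets

    def allocate(self):
        if self.remaining == 0:
            raise OutOfPoolMemory()
        self.remaining -= 1
        return object()  # placeholder for a descriptor set handle


class GrowingAllocator:
    """Allocates from the newest pool and adds a pool when it runs out."""

    def __init__(self, sets_per_pool):
        self.sets_per_pool = sets_per_pool
        self.pools = [DescriptorPool(sets_per_pool)]

    def allocate(self):
        try:
            return self.pools[-1].allocate()
        except OutOfPoolMemory:
            # Expected condition: start a new pool and retry once.
            self.pools.append(DescriptorPool(self.sets_per_pool))
            return self.pools[-1].allocate()
```

Seen this way, the error is part of normal operation for the application, which matches the observation that it only shows up as a message in debug builds.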

pendingchaos avatar Sep 19 '19 11:09 pendingchaos

@aqxa1 @shmerl If you guys still experience hangs or other problems with nongg then please give us a bit more details on what issues there are and how to reproduce those.

Venemo avatar Sep 19 '19 11:09 Venemo

@pendingchaos You're right, it was just the debug build that caused the error messages.

@Venemo Would you prefer to add issues here, or open separate bugs?

aqxa1 avatar Sep 19 '19 11:09 aqxa1

I think separate issues would be best. Just mention in those issues that they are about aco-navi.

Venemo avatar Sep 19 '19 11:09 Venemo

Thanks for the hint. Using RADV_DEBUG=nongg,nocache I was able to start TW3, so that hang was related. It still hangs sometime later, but at least it starts. I'll try to capture some output from the other hang later.

shmerl avatar Sep 19 '19 13:09 shmerl

Also, now that ACO is merged upstream, do we still need to use the external repo for testing (including for aco-navi)?

shmerl avatar Sep 19 '19 13:09 shmerl

For a while, this GitHub repo will include some optimizations that haven't been upstreamed yet. The contents of the aco-navi branch are not upstreamed, so you will need it to test ACO with Navi.

pendingchaos avatar Sep 19 '19 13:09 pendingchaos

@shmerl I really appreciate your enthusiasm, but please keep in mind that aco-navi is still under heavy development. At this point I'm happy that it can run a bunch of example programs. (When I started, it could not even run the simplest triangle example.) This sort of work is not always as straightforward as it seems. I do plan to install The Witcher 3 on my computer and use that for testing, but only after fixing the issues that I currently know about. So no need to capture it.

Venemo avatar Sep 19 '19 15:09 Venemo

Great, thanks for the effort! I'll test it periodically then, to see how the progress goes.

shmerl avatar Sep 19 '19 15:09 shmerl

With regards to your question: at some point I'm gonna rebase my branch on upstream (after all the NIR stuff is merged, possibly), but I don't want to send aco-navi upstream until I'm satisfied that at least most of the popular games work well.

Venemo avatar Sep 19 '19 15:09 Venemo

Pardon me for being a bit clueless, but I'm currently running a 5700XT and am very interested in helping test aco-navi. I tried the mesa-aco-git package from lcarlier's mesa-git repository, but obviously aco-navi isn't on your master branch yet, and I got a GPU hang on boot. I had to hard power cycle and boot straight into a TTY to fix it, which was very messy. How would I go about installing aco-navi? Is there an Arch/AUR/third-party repo package I can install? I don't really get how Meson works, and something about screwing with my graphics drivers seems a bit terrifying.

wherron01 avatar Sep 25 '19 09:09 wherron01

@wherron01 The short answer is that it's not there yet. There is no easy way to install it unless you are comfortable compiling mesa on your own. Please give us a couple of weeks more to get it right. :)

Venemo avatar Sep 25 '19 11:09 Venemo

The aco-navi branch has gone through a major cleanup and is now rebased on top of latest mesa master (as of yesterday). There are still some issues, but overall it works much better for me. I haven't installed The Witcher 3 yet, but I did some testing with Dota 2.

Venemo avatar Oct 04 '19 10:10 Venemo

Thanks! I'll give it a go a bit later and will post results!

shmerl avatar Oct 04 '19 13:10 shmerl

Just tested The Witcher 3, it's still hanging after loading a saved game.

shmerl avatar Oct 04 '19 21:10 shmerl