failed to convert Vulkan driver statistics to RGA format on Linux
Hi,
I'm trying to get a very simple pipeline analyzed, but I can't get RGA to work in online mode. The output just says Error: failed to convert Vulkan driver statistics to RGA format.
I believe it's failing right here https://github.com/GPUOpen-Tools/radeon_gpu_analyzer/blob/f2cb7cf71ed620c427e2625aba3e85b70e9537bb/RadeonGPUAnalyzerCLI/Src/kcCLICommanderVulkan.cpp#L522
I'm on Linux x64 5.8.7, AMDVLK 2020.Q3.4 and:
$ vulkaninfo | rg PhysicalDeviceProp -A10
VkPhysicalDeviceProperties:
---------------------------
apiVersion = 4202646 (1.2.150)
driverVersion = 8388763 (0x80009b)
vendorID = 0x1002
deviceID = 0x66af
deviceType = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
deviceName = AMD Radeon VII
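For readers comparing driver builds, the packed version numbers above can be decoded by hand. This is a minimal sketch: `vk_decode_version` is a hypothetical helper name, and it assumes the standard Vulkan packing (major in bits 22 and up, minor in bits 12 to 21, patch in bits 0 to 11). `driverVersion` is vendor-defined, so decoding it the same way is only an assumption.

```shell
#!/usr/bin/env bash
# Decode a packed Vulkan version number into major.minor.patch.
# Layout assumed: major = bits 22+, minor = bits 12-21, patch = bits 0-11.
vk_decode_version() {
    local v=$1
    echo "$(( v >> 22 )).$(( (v >> 12) & 0x3ff )).$(( v & 0xfff ))"
}

vk_decode_version 4202646   # apiVersion from the output above -> 1.2.150
vk_decode_version 8388763   # driverVersion (0x80009b), assuming the same packing -> 2.0.155
```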
Is this an incompatibility with the latest AMDVLK release?
Also a few minor questions if I may:
- Is the PSO format generated with Fossilize? I saw the one for DX12 and it looks completely custom, while the Vulkan one is just a JSON dump
- Will there be support for source mapping in the future? I wanted to write a language server (Language Server Protocol) that embeds live ISA and register pressure in my code editor while I edit shaders, but there doesn't seem to be a way to map this analysis back to the source.
Hi farnoy,
- What Linux variant are you using? Note that RGA officially only supports Ubuntu.
- Is there a Vulkan ICD manifest file present under /opt/amdgpu-pro/etc/vulkan/icd.d/amd_icd64.json on your system? If you set the VK_ICD_FILENAMES environment variable to /opt/amdgpu-pro/etc/vulkan/icd.d/amd_icd64.json - are you still seeing the same error? This should force the amdgpu-pro driver to be used (which is the driver RGA relies on).
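The check suggested above could be scripted roughly as follows. This is a sketch, not an official RGA procedure: `use_icd` is a hypothetical helper name, and the manifest path is the one quoted in this reply, which may differ on other installs.

```shell
#!/usr/bin/env bash
# Point the Vulkan loader at one specific ICD manifest, but only if it exists.
use_icd() {
    local manifest=$1
    if [ ! -f "$manifest" ]; then
        echo "ICD manifest not found: $manifest" >&2
        return 1
    fi
    export VK_ICD_FILENAMES="$manifest"
    echo "VK_ICD_FILENAMES=$VK_ICD_FILENAMES"
}

# Path taken from the reply above; adjust it to your installation.
use_icd /opt/amdgpu-pro/etc/vulkan/icd.d/amd_icd64.json || true
```

After the variable is exported, RGA has to be started from the same shell so that it inherits the setting.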
-=-=-=-
To your questions:
- The .gpso and .cpso file formats were derived from the Fossilize format. At the time of development, Fossilize used to pack the SPIR-V binaries inside the file and also supported packing multiple pipelines within the same Fossilize file. RGA's .cpso and .gpso files describe a single pipeline and do not store the SPIR-V binaries.
- I assume that you are referring to correlation from GLSL/SPIR-V to ISA disassembly, similar to what RGA supports in ROCm OpenCL mode. This feature is on our roadmap, but since it requires updates to AMD's shader compiler, it is not expected to be available soon.
I'm on Arch Linux, and my AMDVLK installation is built from source with this script.
I already used VK_ICD_FILENAMES in the original report, because I have a separate Mesa RADV stack that I didn't want RGA to use.
Thanks for taking my questions. I was indeed referring to the OpLine instructions that both glslang and dxc can output. It should be a useful feature for livereg and/or assembly when it's ready.
I did a bit more digging and found something interesting. I've modified the rga bash script wrapper to execute rga-bin --verbose "$@". This showed me intermediate commands in the GUI window. The full output was:
Building Vulkan project "asd" for gfx906
./rga -s vulkan --isa "/home/kuba/RadeonGPUAnalyzer/projects/asd/Output/Clone0/disassem.txt" --parse-isa --line-numbers --analysis "/home/kuba/RadeonGPUAnalyzer/projects/asd/Output/Clone0/resourceUsage.csv" -b "/home/kuba/RadeonGPUAnalyzer/projects/asd/Output/Clone0/codeobj.bin" --log "/home/kuba/.local/share/RadeonGPUAnalyzer/rga-cli-20200909-214907.log" --icd "/usr/share/vulkan/icd.d/amd_icd64.json" --glslang-opt "@--target-env vulkan1.1@" --compiler-bin "/home/kuba/1.2.148.1/x86_64/bin" --session-metadata "/home/kuba/RadeonGPUAnalyzer/projects/asd/Output/Clone0/gfx906_cliInvocation.xml" --asic gfx906 --pso "/home/kuba/RadeonGPUAnalyzer/projects/asd/Clone0/Pipeline0.gpso" --vert "/data/renderer/src/shaders/gui.vert" --frag "/data/renderer/src/shaders/gui.frag"
Info: forcing the Vulkan runtime to load a custom ICD: /usr/share/vulkan/icd.d/amd_icd64.json
Launching external process: /home/kuba/rga/Vulkan//VulkanBackend --list-targets --icd /usr/share/vulkan/icd.d/amd_icd64.json
Target GPU detected:
gfx906 (Vega) AMD Radeon VII
Pre-compiling vertex shader file (/data/renderer/src/shaders/gui.vert) to SPIR-V binary (/home/kuba/.rga/GPUOpen/rga/all-devices_rga-temp-out2393283_vert.spv)...
Launching external process: /home/kuba/1.2.148.1/x86_64/bin/glslangValidator --target-env vulkan1.1 -V -o /home/kuba/.rga/GPUOpen/rga/all-devices_rga-temp-out2393283_vert.spv /data/renderer/src/shaders/gui.vert
succeeded.
Pre-compiling fragment shader file (/data/renderer/src/shaders/gui.frag) to SPIR-V binary (/home/kuba/.rga/GPUOpen/rga/all-devices_rga-temp-out2393283_frag.spv)...
Launching external process: /home/kuba/1.2.148.1/x86_64/bin/glslangValidator --target-env vulkan1.1 -V -o /home/kuba/.rga/GPUOpen/rga/all-devices_rga-temp-out2393283_frag.spv /data/renderer/src/shaders/gui.frag
succeeded.
Building for gfx906...
Launching external process: /home/kuba/rga/Vulkan//VulkanBackend --target gfx906 --vert /home/kuba/.rga/GPUOpen/rga/all-devices_rga-temp-out2393283_vert.spv --vert-isa /home/kuba/RadeonGPUAnalyzer/projects/asd/Output/Clone0/gfx906_disassem_vert.txt --vert-stats /home/kuba/RadeonGPUAnalyzer/projects/asd/Output/Clone0/gfx906_resourceUsage_vert.csv --frag /home/kuba/.rga/GPUOpen/rga/all-devices_rga-temp-out2393283_frag.spv --frag-isa /home/kuba/RadeonGPUAnalyzer/projects/asd/Output/Clone0/gfx906_disassem_frag.txt --frag-stats /home/kuba/RadeonGPUAnalyzer/projects/asd/Output/Clone0/gfx906_resourceUsage_frag.csv --bin /home/kuba/RadeonGPUAnalyzer/projects/asd/Output/Clone0/gfx906_codeobj.bin --pso /home/kuba/RadeonGPUAnalyzer/projects/asd/Clone0/Pipeline0.gpso --icd /usr/share/vulkan/icd.d/amd_icd64.json
Using Vulkan ICD from custom location: /usr/share/vulkan/icd.d/amd_icd64.json
failed. Error: failed to convert Vulkan driver statistics to RGA format.
However, when I run the three leaf commands manually in a shell (two glslangValidator invocations and VulkanBackend), everything works fine. For example:
$ /home/kuba/rga/Vulkan//VulkanBackend --target gfx906 \
--vert /home/kuba/.rga/GPUOpen/rga/all-devices_rga-temp-out2393283_vert.spv \
--frag /home/kuba/.rga/GPUOpen/rga/all-devices_rga-temp-out2393283_frag.spv \
--frag-stats /dev/stdout \
--pso /home/kuba/RadeonGPUAnalyzer/projects/asd/Clone0/Pipeline0.gpso \
--icd /usr/share/vulkan/icd.d/amd_icd64.json
Using Vulkan ICD from custom location: /usr/share/vulkan/icd.d/amd_icd64.json
Statistics:
- shaderStageMask = 16
- resourceUsage.numUsedVgprs = 24
- resourceUsage.numUsedSgprs = 14
- resourceUsage.ldsSizePerLocalWorkGroup = 65536
- resourceUsage.ldsUsageSizeInBytes = 0
- resourceUsage.scratchMemUsageInBytes = 0
- numPhysicalVgprs = 256
- numPhysicalSgprs = 800
- numAvailableVgprs = 256
- numAvailableSgprs = 104
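The `shaderStageMask = 16` value in the statistics above is a `VkShaderStageFlags` bitmask. A small sketch for turning such a mask into stage names follows. `decode_stage_mask` is a hypothetical helper, and it only covers the six classic stages (bits 0 to 5: vertex, tessellation control, tessellation evaluation, geometry, fragment, compute).

```shell
#!/usr/bin/env bash
# Translate a VkShaderStageFlags bitmask into stage names.
# Only bits 0-5 (the six classic pipeline stages) are handled here.
decode_stage_mask() {
    local mask=$1 out="" i
    local names=(VERTEX TESSELLATION_CONTROL TESSELLATION_EVALUATION GEOMETRY FRAGMENT COMPUTE)
    for i in "${!names[@]}"; do
        if (( mask & (1 << i) )); then
            out+="${out:+|}${names[i]}"   # join multiple stages with '|'
        fi
    done
    echo "${out:-NONE}"
}

decode_stage_mask 16   # the mask reported above -> FRAGMENT
```

So the statistics printed by the manual run belong to the fragment stage, which matches the `--frag-stats /dev/stdout` argument.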
~So when the GUI invokes it, it fails, but when I do the same thing from the shell it works.~ EDIT: never mind, it's the GUI that throws the error.
I tried replacing the VulkanBackend binary with a wrapper script like this to enable api_dump:
#!/usr/bin/fish
ls ~/.rga/GPUOpen/rga/*.spv
set -x VK_INSTANCE_LAYERS VK_LAYER_LUNARG_api_dump
set -x VK_APIDUMP_LOG_FILENAME /tmp/vulkanbackend-api-dump
eval (dirname (status -f))/VulkanBackend-bin $argv
The wrapper confirms that the .spv files exist (they do) and enables the API dump. Everything exits cleanly, with the last API calls being:
Thread 0, Frame 0:
vkGetShaderInfoAMD(device, pipeline, shaderStage, infoType, pInfoSize, pInfo) returns VkResult VK_SUCCESS (0):
device: VkDevice = 0x37aa990
pipeline: VkPipeline = 0x326f760
shaderStage: VkShaderStageFlagBits = 16 (VK_SHADER_STAGE_FRAGMENT_BIT)
infoType: VkShaderInfoTypeAMD = VK_SHADER_INFO_TYPE_DISASSEMBLY_AMD (2)
pInfoSize: size_t* = 2156
pInfo: void* = 0x345c860
Thread 0, Frame 0:
vkGetShaderInfoAMD(device, pipeline, shaderStage, infoType, pInfoSize, pInfo) returns VkResult VK_SUCCESS (0):
device: VkDevice = 0x37aa990
pipeline: VkPipeline = 0x326f760
shaderStage: VkShaderStageFlagBits = 16 (VK_SHADER_STAGE_FRAGMENT_BIT)
infoType: VkShaderInfoTypeAMD = VK_SHADER_INFO_TYPE_STATISTICS_AMD (0)
pInfoSize: size_t* = 72
pInfo: void* = 0x7ffcf9966140
Thread 0, Frame 0:
vkDestroyPipeline(device, pipeline, pAllocator) returns void:
device: VkDevice = 0x37aa990
pipeline: VkPipeline = 0x326f760
pAllocator: const VkAllocationCallbacks* = NULL
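A dump like the one above can be condensed to one line per `vkGetShaderInfoAMD` call, which makes it easier to see which stages and info types were queried. This is a sketch that assumes the text layout shown above (one `name: type = value` pair per line). `summarize_shader_info` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print "<stage> <infoType> <size> bytes" for every vkGetShaderInfoAMD call
# found in an api_dump text log (read from the given files, or from stdin).
summarize_shader_info() {
    awk '
        /vkGetShaderInfoAMD\(/          { in_call = 1; next }
        in_call && $1 == "shaderStage:" { stage = $NF; gsub(/[()]/, "", stage) }
        in_call && $1 == "infoType:"    { type = $(NF-1) }
        in_call && $1 == "pInfoSize:"   { print stage, type, $NF " bytes"; in_call = 0 }
    ' "$@"
}

# Example, using the log path from the wrapper script above:
# summarize_shader_info /tmp/vulkanbackend-api-dump
```

If a stage is missing from the summary, the backend never asked the driver for it, which would help narrow down whether the failure happens before or after the driver is queried.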
So for each stage, it's collecting the binary, disassembly and statistics, all as expected. The only odd thing is that the driver reports fictional devices. I guess that is configured out of band, because I only see the effects:
Thread 0, Frame 0:
vkGetPhysicalDeviceProperties(physicalDevice, pProperties) returns void:
physicalDevice: VkPhysicalDevice = 0x384ab10
pProperties: VkPhysicalDeviceProperties* = 0x7ffcf86bb060:
apiVersion: uint32_t = 0
driverVersion: uint32_t = 0
vendorID: uint32_t = 0
deviceID: uint32_t = 31
deviceType: VkPhysicalDeviceType = VK_PHYSICAL_DEVICE_TYPE_OTHER (0)
deviceName: char[VK_MAX_PHYSICAL_DEVICE_NAME_SIZE] = "NAVI14:gfx1012"
On the other hand, the GUI only has a problem with my vertex shader. If I remove it from the pipeline, offline mode is used, but I don't see the error about driver statistics. If I remove the fragment shader and leave only the vertex shader, it fails again. I'm not sure what makes it so special; it's a very simple shader:
#version 450
#extension GL_EXT_scalar_block_layout: require
layout(push_constant, scalar) uniform PushConstants {
    vec2 scale;
    vec2 translate;
} pushConstants;
layout (location = 0) in vec2 pos;
layout (location = 1) in vec2 uv;
layout (location = 2) in vec4 col;
layout (location = 0) out vec4 out_color;
layout (location = 1) out vec2 out_uv;
void main() {
    out_color = col;
    out_uv = uv;
    gl_Position = vec4(pos * pushConstants.scale + pushConstants.translate, 0, 1);
    gl_Position.y *= -1.0;
}
I hope this helps. I understand that only Ubuntu is officially supported, but seeing as multiple other components are working fine, this seems to be a legitimate issue.