aomp
[amdgpu] compiling for gfx9 spills a LOT into 64 vgpr
[reposted from llvm-project for more visibility] Hey guys, I am trying to compile some LLVM IR for gfx908. It seems to have a lot of spills but is only trying to fit into 64 VGPRs, which is odd since IIRC gfx908 should have 256 VGPRs.
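One possible explanation (an assumption on my part, since I can't see the attributes in the IR): the AMDGPU backend divides the 256 VGPRs per SIMD by the occupancy it must guarantee. Without attributes, it assumes workgroups of up to 1024 threads, i.e. 16 wave64 waves across 4 SIMDs, so at least 4 waves per SIMD and a budget of 256 / 4 = 64 VGPRs. A minimal sketch of the relevant kernel attributes (the kernel name and values here are hypothetical):

```llvm
; Sketch only: how function attributes constrain the VGPR budget on gfx908.
; "amdgpu-flat-work-group-size"="1,1024" (also the default when unset) forces
; the allocator to assume 4 waves per SIMD, capping it at 256 / 4 = 64 VGPRs.
; Declaring a smaller maximum workgroup size, e.g. "1,256", would let the
; allocator use the full 256-VGPR budget.
define amdgpu_kernel void @example_kernel() #0 {
entry:
  ret void
}

attributes #0 = { "amdgpu-flat-work-group-size"="1,1024" }
```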
Here is the LLVM IR I am trying to compile: ll file
Compiling with llc-13 generates an assembly file in which we can see 654 VGPR spills with a VGPR count of only 64.
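For anyone reproducing this, the reported register usage and the spill sites can be pulled straight out of the generated `.s` file. A minimal sketch (the sample file below is a stand-in for the real output; the directive and comment strings are the ones the AMDGPU asm printer normally emits):

```shell
# Create a tiny stand-in for the llc-generated assembly so the commands
# below are self-contained; with the real output, point grep at that file.
cat > sample.s <<'EOF'
  .amdhsa_next_free_vgpr 64
; NumVgprs: 64
  buffer_store_dword v0, off, s[0:3], 0 offset:4 ; 4-byte Folded Spill
  buffer_load_dword v0, off, s[0:3], 0 offset:4 ; 4-byte Folded Reload
EOF

# The VGPR budget the backend settled on:
grep 'amdhsa_next_free_vgpr' sample.s

# How many spill stores it emitted:
grep -c 'Folded Spill' sample.s
```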
/usr/bin/llc-13 -mtriple=amdgcn--amdhsa-amdgiz --march=amdgcn -mcpu=gfx908 /home/stanley/output__inference_learn_29220_dispatch_227.ll -filetype=asm
# output assembly file seen [here](https://gist.github.com/raikonenfnu/763bf67d0f9f8d3833fd3f3978bb2d8e#file-output__inference_learn_29220_dispatch_227-s-L5325-L5336)
I have also tried compiling with an llc built from the llvm-project linked in aomp, which gives a more detailed error:
./llc -mtriple=amdgcn--amdhsa-amdgiz --march=amdgcn -mcpu=gfx908 /home/stanley/output__inference_learn_29220_dispatch_227.ll -filetype=asm
LLVM ERROR: Error while trying to spill VGPR0 from class VGPR_32: Cannot scavenge register without an emergency spill slot!
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0. Program arguments: ./llc -mtriple=amdgcn--amdhsa-amdgiz --march=amdgcn -mcpu=gfx908 /home/stanley/output__inference_learn_29220_dispatch_227.ll -filetype=asm
1. Running pass 'CallGraph Pass Manager' on module '/home/stanley/output__inference_learn_29220_dispatch_227.ll'.
2. Running pass 'Prologue/Epilogue Insertion & Frame Finalization' on function '@__inference_learn_29220_dispatch_227'
Could there be a bug in the spill/register-allocation code of the AMDGPU backend? Any advice, comments, or guidance would be very helpful!