Rob Lyerly
It seems like this is a race condition when bulk-migrating lots of threads back to the origin -- when kernel printing is turned on for thread migrations, the issue seems...
I did some digging and the GPU is definitely not running out of memory: I added print statements and it was only allocating several hundred MB of GPU RAM. Instead...
Thanks for replying! Yeah, unfortunately my system is kind of a weird one - it's one of the Kaby Lake G laptops with the Vega M GPU that Intel was...
Update -- when digging into some TLS fixes I learned that LLVM uses R_AARCH64_TLSLE_ADD_TPREL_[HI12 | LO12] relocations for TLS addresses. These two relocations combine together to give a total addressable...
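To make the way the two relocations combine concrete, here is a minimal sketch. It assumes, per the AArch64 ELF ABI, that the HI12 relocation encodes bits [23:12] of the thread-pointer-relative offset and the LO12 relocation encodes bits [11:0]. The function names are hypothetical and only illustrate the arithmetic; this is not code from LLVM or from this project.

```python
# Illustrative only: models how two 12-bit immediates cover one TLS offset.
# Assumption: HI12 holds bits [23:12] and LO12 holds bits [11:0] of the
# thread-pointer-relative offset (AArch64 ELF ABI). Names are hypothetical.

IMM_BITS = 12
IMM_MASK = (1 << IMM_BITS) - 1
MAX_OFFSET = 1 << (2 * IMM_BITS)  # first offset that no longer fits


def split_tprel(offset):
    """Split a TP-relative offset into (hi12, lo12) immediates."""
    if not 0 <= offset < MAX_OFFSET:
        raise ValueError("offset does not fit in two 12-bit immediates")
    hi12 = (offset >> IMM_BITS) & IMM_MASK
    lo12 = offset & IMM_MASK
    return hi12, lo12


def combine_tprel(hi12, lo12):
    """Rebuild the offset the way the two ADD instructions would."""
    return (hi12 << IMM_BITS) + lo12
```

With this model, any offset at or beyond `MAX_OFFSET` cannot be expressed by the pair, which is the kind of limit that matters once TLS data (or its alignment padding) grows large.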
I know this patch fixes this particular problem for EP, but it's not very robust (plus having gigantic alignment directives all over the place balloons the size of the binary)....
This is awesome, thanks! Once you verify that it works, you can submit a pull request, or let me know if you want me to implement it. Just a couple of...
**More background** The rematerializing code is built to cheaply reproduce a small number of values for a single instruction. Since almost all machine instructions only use a handful of values,...
This looks like a copy-and-paste error -- I think aarch64_bin should simply be powerpc64_bin. Can you submit a pull request with the fix?
@mohamed-karaoui is this issue now closed?