Add TMA example for Hopper H100
This sample code shows how to create a TMA descriptor using the driver API and how to initiate a TMA transfer using inline PTX.
I have not yet had the chance to copy over the Makefile from other directories. What is the preferred solution here, and for creating the Visual Studio solution files?
This example can only be compiled with -arch sm_90. Previous architectures are not supported.
Maybe modify the printf to state that the following code will fail. You have
Care must be taken to ensure that the coordinates result in a memory offset that is aligned to 16 bytes. With 32 bit integer elements, x coordinates that are not a multiple of 4 result in a non-recoverable error:
Maybe add "the following functions should fail due to ....".
Thanks! I agree that seeing the end of the example fail is confusing. I have taken your suggestion and also added "(as expected)" to each error line. This should make it more obvious that the final lines document kernels that are expected to fail.
**NOTE**: The following code will fail.
Care must be taken to ensure that the coordinates result in a memory offset
that is aligned to 16 bytes. With 32 bit integer elements, x coordinates
that are not a multiple of 4 result in a non-recoverable error:
CUDA error (as expected): an illegal instruction was encountered globalToShmemTMACopy.cu 346
CUDA error (as expected): an illegal instruction was encountered globalToShmemTMACopy.cu 348
CUDA error (as expected): an illegal instruction was encountered globalToShmemTMACopy.cu 350
CUDA error (as expected): an illegal instruction was encountered globalToShmemTMACopy.cu 352
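For anyone who wants to guard against this on the host side, the alignment rule in the note above could be checked before launching a kernel. This is only a sketch and not part of the sample: the helper name `tmaXCoordIsAligned` and the constant `kTmaAlignmentBytes` are hypothetical, and it assumes the row pitch of the tensor is already a multiple of 16 bytes, so that only the x coordinate decides the alignment.

```cpp
#include <cstdint>

// Alignment that the byte offset must satisfy, taken from the note above.
constexpr std::int64_t kTmaAlignmentBytes = 16;

// Hypothetical helper (not part of the sample or of any CUDA API):
// returns true if the x coordinate, measured in elements of
// elemSizeBytes bytes each, lands on a 16-byte boundary.
// Assumes the row pitch is itself a multiple of 16 bytes.
constexpr bool tmaXCoordIsAligned(std::int32_t x, std::int64_t elemSizeBytes) {
    const std::int64_t byteOffset = static_cast<std::int64_t>(x) * elemSizeBytes;
    return byteOffset % kTmaAlignmentBytes == 0;
}
```

With 32-bit integer elements (`elemSizeBytes == 4`), this returns true for x = 0, 4, 8, ... and false for x = 1, 2, 3, which matches the "multiple of 4" rule in the note.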
Thanks. Run first, ask questions later, that's my motto... When I see errors I tend to look for problems instead of thinking "learning exercise" :)
Haha no problem. As they say "you don't have to prepare to win the lottery, but the lottery has to prepare for someone to win". There will always be somebody running the samples in a hurry, even if many people won't. Better to have them covered from the start :)