Dustyn Blasig

Results: 29 comments by Dustyn Blasig

Hey @cliffburdick, I finally got around to fixing and cleaning up some of the CUTLASS CMake code. I tested this out and it worked fine for me using CMake 3.18....

Can you provide some details about the library/application you are linking cutlass with? Does it include CUDA language support already, for instance? Does it compile other `.cu` files successfully?

Please set the `/Zc:__cplusplus` flag, as shown in https://github.com/NVIDIA/cutlass/blob/7d49e6c7e2f8896c47f586706e67e1fb215529dc/CMakeLists.txt#L439-453, so that MSVC reports the correct `__cplusplus` value.
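For context, a minimal sketch of what the flag changes: without `/Zc:__cplusplus`, MSVC reports `__cplusplus` as `199711L` whatever `/std:` level is selected, while `_MSVC_LANG` carries the real level. The helper names below are hypothetical and not part of CUTLASS; they only compare the two values.

```cpp
#include <cassert>

// Hypothetical helper (not from CUTLASS): the standard level the compiler
// advertises through the __cplusplus macro.
constexpr long reported_cplusplus() {
  return __cplusplus;
}

// Hypothetical helper: on MSVC, _MSVC_LANG holds the real standard level even
// when /Zc:__cplusplus is absent; other compilers only define __cplusplus.
constexpr long effective_cplusplus() {
#if defined(_MSVC_LANG)
  return _MSVC_LANG;
#else
  return __cplusplus;
#endif
}

// True when __cplusplus matches the real standard level, i.e. the compiler is
// not MSVC, or MSVC was invoked with /Zc:__cplusplus.
constexpr bool standard_reporting_is_consistent() {
  return reported_cplusplus() == effective_cplusplus();
}

// Aborts (in builds with assertions enabled) if the two values disagree.
inline void require_consistent_reporting() {
  assert(standard_reporting_is_consistent());
}
```

Calling `require_consistent_reporting()` early in a test binary is one way to catch a build where the flag was dropped; headers that gate features on `__cplusplus` would otherwise take their pre-C++11 paths without any warning.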

@hwu36 to confirm, but I believe this fix will come with 3.5.1. If not, we can certainly merge it shortly after.

The profiler will run any kernels capable of implementing the given problem requirements. In this case, the constraints are pretty general and many kernels will be able to support a...

Have you tried changing the accumulation type to fp32? See https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html for details on which datatype configurations are supported.

@leimao, is this fixing a failure you were seeing? If so, please elaborate. If not, the recommendation would be to allow the compiler to do its thing with the code...

@Anoncheg1, please list your full CMake configuration and build commands as well as what changes you made to source. Thanks!

Thanks for working on this @BwL1289, always glad to get help on the build side!

> Happy to help! Another potential improvement is allowing the user to specify their own `cxx_std_XX` (instead of hardcoding `cxx_std_17`)

Making that configurable may work, although we'd need to ensure...