CppSPMD_Fast
Build out fp32 functionality a bit more
Add utility functions (floatbits, intbits, shuffle, bit twiddling, rsqrt_fast). Add float3 / float4 helpers. Mark the print functions inline so the header can be included from multiple translation units without duplicate-definition link errors.
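For readers unfamiliar with these helpers, here is a minimal scalar sketch of what floatbits, intbits, and rsqrt_fast style functions typically do. It is not the CppSPMD_Fast implementation: the real helpers operate on SIMD vector types, and the signatures below are assumptions chosen for illustration. The rsqrt_fast body uses the well-known magic-constant bit trick with one Newton-Raphson step, which may differ from what this change actually uses. All functions are marked inline, which is also what lets header-defined functions such as the print helpers be included from several source files.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical scalar analogue: return the raw IEEE-754 bits of a float.
inline std::uint32_t intbits(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);  // well-defined type punning
    return u;
}

// Hypothetical scalar analogue: build a float from raw IEEE-754 bits.
inline float floatbits(std::uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Approximate 1/sqrt(x) for x > 0 (assumed technique, not taken from the source):
// a bit-level initial guess refined by one Newton-Raphson iteration.
inline float rsqrt_fast(float x) {
    assert(x > 0.0f);
    std::uint32_t i = 0x5f3759dfu - (intbits(x) >> 1);  // initial guess
    float y = floatbits(i);
    return y * (1.5f - 0.5f * x * y * y);               // one refinement step
}
```

With this sketch, intbits(1.0f) yields 0x3f800000, and rsqrt_fast(4.0f) lands close to 0.5, with a relative error on the order of a few tenths of a percent.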
Apologies for not including full implementations for all targets!