While addressing the following issue, I realized that we have no tests that build with __CUDA_NO_HALF_CONVERSIONS__ defined. The original motivation is described here.
__CUDA_NO_HALF_CONVERSIONS__