composable_kernel
Bwd dropout update
Updated judgement of dropout. Performance is improved when p_drop = 0.

G0 G1 M K

54 16 512 64:
  before: 4.49336 ms, 32.2599 TFlops, 101.206 GB/s
  now:    3.51236 ms, 41.27 TFlops, 129.472 GB/s

54 16 512 128:
  before: 13.0223 ms, 22.2627 TFlops, 69.7067 GB/s
  now:    9.29781 ms, 31.1805 TFlops, 97.6293 GB/s
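To put the timings in relative terms, the sketch below computes the speedup factor (before / now) for each problem size. The timings are copied from this message; the names `timings_ms`, `speedup`, and `summarize` are hypothetical helpers for illustration and are not part of composable_kernel.

```python
# Timings in milliseconds copied from the commit message.
# Keys are the problem sizes (G0, G1, M, K); values are (before, now).
timings_ms = {
    (54, 16, 512, 64): (4.49336, 3.51236),
    (54, 16, 512, 128): (13.0223, 9.29781),
}


def speedup(before_ms, now_ms):
    """Return how many times faster the new run is than the old one."""
    return before_ms / now_ms


def summarize(timings):
    """Map each problem size to its speedup factor (before / now)."""
    return {shape: speedup(before, now) for shape, (before, now) in timings.items()}


if __name__ == "__main__":
    for shape, factor in summarize(timings_ms).items():
        print(shape, f"{factor:.2f}x")
```

With the numbers above this gives roughly 1.28x for K = 64 and 1.40x for K = 128, which matches the ratio of the reported TFlops figures.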