llm.c

float4 with better vectorization for adamw.cu

Open lancerts opened this issue 10 months ago • 0 comments

Timings on a 3070:

Kernel 2: GPU time 0.0799 ms, CPU time 0.0168 ms

Kernel 3: GPU time 0.0780 ms, CPU time 0.0166 ms
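
For readers who have not seen the pattern, here is a minimal sketch of a float4-vectorized AdamW update. It is not the kernel that was benchmarked; it only illustrates the idea from the title, where each thread loads and stores four consecutive floats per memory transaction. The names `AdamWArgs`, `adamw_update`, `adamw_kernel_float4` and `max_abs_diff_vs_cpu` are hypothetical and do not come from llm.c. The sketch assumes the buffers come straight from `cudaMalloc`, so they are aligned for `float4` access, and it omits error checking for brevity.

```cuda
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical bundle of AdamW hyperparameters (not a struct from llm.c).
struct AdamWArgs {
    float lr, beta1, beta2, beta1_correction, beta2_correction, eps, weight_decay;
};

// Scalar AdamW step for one element, shared by the kernel and the CPU reference.
__host__ __device__ inline void adamw_update(float& p, float g, float& m, float& v, AdamWArgs a) {
    m = a.beta1 * m + (1.0f - a.beta1) * g;        // first moment
    v = a.beta2 * v + (1.0f - a.beta2) * g * g;    // second moment
    float m_hat = m / a.beta1_correction;          // bias correction
    float v_hat = v / a.beta2_correction;
    p -= a.lr * (m_hat / (sqrtf(v_hat) + a.eps) + a.weight_decay * p);
}

// Each thread handles one group of four consecutive floats via float4 loads and stores.
// Assumes all pointers are 16-byte aligned (true for pointers returned by cudaMalloc).
__global__ void adamw_kernel_float4(float* params, const float* grads, float* m_mem,
                                    float* v_mem, long n, AdamWArgs a) {
    long i4 = blockIdx.x * (long)blockDim.x + threadIdx.x;
    long base = 4 * i4;
    if (base + 3 < n) {
        float4 p = reinterpret_cast<float4*>(params)[i4];
        float4 g = reinterpret_cast<const float4*>(grads)[i4];
        float4 m = reinterpret_cast<float4*>(m_mem)[i4];
        float4 v = reinterpret_cast<float4*>(v_mem)[i4];
        adamw_update(p.x, g.x, m.x, v.x, a);
        adamw_update(p.y, g.y, m.y, v.y, a);
        adamw_update(p.z, g.z, m.z, v.z, a);
        adamw_update(p.w, g.w, m.w, v.w, a);
        reinterpret_cast<float4*>(params)[i4] = p;
        reinterpret_cast<float4*>(m_mem)[i4] = m;
        reinterpret_cast<float4*>(v_mem)[i4] = v;
    } else {
        // Scalar tail: at most three leftover elements when n is not a multiple of 4.
        for (long i = base; i < n; ++i) {
            adamw_update(params[i], grads[i], m_mem[i], v_mem[i], a);
        }
    }
}

// Runs one step on the GPU and on the CPU, returns the largest absolute difference in params.
float max_abs_diff_vs_cpu(int n) {
    // Bias corrections are 1 - beta^t with t = 1.
    AdamWArgs a = {1e-3f, 0.9f, 0.999f, 1.0f - 0.9f, 1.0f - 0.999f, 1e-8f, 0.01f};
    std::vector<float> p(n), g(n), m(n, 0.0f), v(n, 0.0f), out(n);
    for (int i = 0; i < n; ++i) {
        p[i] = 0.01f * (i % 17);
        g[i] = 0.001f * ((i % 13) - 6);
    }

    size_t bytes = n * sizeof(float);
    float *dp, *dg, *dm, *dv;
    cudaMalloc(&dp, bytes); cudaMalloc(&dg, bytes);
    cudaMalloc(&dm, bytes); cudaMalloc(&dv, bytes);
    cudaMemcpy(dp, p.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dg, g.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dm, m.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dv, v.data(), bytes, cudaMemcpyHostToDevice);

    int groups = (n + 3) / 4;   // one thread per float4 group, rounded up for the tail
    int block = 256;
    adamw_kernel_float4<<<(groups + block - 1) / block, block>>>(dp, dg, dm, dv, n, a);
    cudaMemcpy(out.data(), dp, bytes, cudaMemcpyDeviceToHost);  // blocks until the kernel is done
    cudaFree(dp); cudaFree(dg); cudaFree(dm); cudaFree(dv);

    float max_diff = 0.0f;
    for (int i = 0; i < n; ++i) {
        adamw_update(p[i], g[i], m[i], v[i], a);   // CPU reference
        max_diff = fmaxf(max_diff, fabsf(p[i] - out[i]));
    }
    return max_diff;
}
```

Calling `max_abs_diff_vs_cpu` with a size that is not a multiple of 4 (for example 1027) also exercises the scalar tail. Small differences between the GPU and CPU results are expected, since the device compiler may fuse multiply-adds.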

lancerts, Apr 27 '24 06:04