Windsor Nguyễn

2 comments by Windsor Nguyễn

I find that adding dropout decreases performance for state space models. Has anyone else observed this?

Still running into the same issues with CUDA 12.4, torch=2.5.0.dev20240709+cu124, triton=2.3.1
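
For anyone comparing setups, here is a minimal sketch of one way to print package versions in the same `name=version` form as above. It uses only the standard library, and the helper names `installed_version` and `environment_report` are made up for illustration. It does not report the CUDA version; assuming torch is importable, `torch.version.cuda` should give the CUDA version that build was compiled against.

```python
from importlib import metadata


def installed_version(package: str) -> str:
    """Return the installed version of `package`, or a placeholder if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"


def environment_report(packages=("torch", "triton")) -> str:
    """Join name=version pairs into one line suitable for pasting into a comment."""
    return ", ".join(f"{name}={installed_version(name)}" for name in packages)


if __name__ == "__main__":
    # Output depends on the local environment, so no expected result is shown.
    print(environment_report())
```

Querying package metadata instead of importing the packages means the report still works when an import itself is the thing that is failing.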