Windsor Nguyễn
I find that adding dropout decreases performance for state space models. Has anyone else observed this?
Still running into the same issues with CUDA 12.4, torch=2.5.0.dev20240709+cu124, triton=2.3.1