Ankit Mathur
Results: 23 comments of Ankit Mathur
Also, @byshiue it's a little unclear to me how to tell whether flash attention is being used. Is this exposed in any of the models that are currently...
I'm not an expert on the FMHA environment variable, so I can't really help much here. That being said, accuracy concerns are totally different from "output is garbage", so I'm...
cc: @harupy @dbczumar - @niklasdiehm I think we'd be happy to help out if you give it a try!