Tanmay Laud
@koaning I am guessing Rasa uses some form of beam search. In that case a minimum spanning tree could be created and then the chosen beam could be highlighted. And...
Thanks for sharing @yemikifouly !
#self-assign
#self-assign
I am facing the same issue
@jishengpeng awaiting your response on this. Thanks!
Having this issue with 70b bf16, 405b bf16 and fp8. Is there a root cause analysis on this?
It is the flashinfer sampling that is causing the determinism issue. Do we have any fixes for that? Torch sampling with argmax is deterministic but it is slower.
@jishengpeng what did the training data look like for the text-speech alignment?