Can Seamless perform timestamping?

Open pallavidevi640 opened this issue 2 years ago • 5 comments

Can Seamless perform timestamping?

pallavidevi640 avatar Sep 22 '23 05:09 pallavidevi640

You can do it today by adding a code wrapper that records the encoder-decoder attention weights, then using those weights to infer the alignment between the input audio frames and the output text tokens. We plan to add this wrapper for the speech recognition (ASR) task in the near future (November-December).
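For the second step (turning recorded attention weights into timestamps), here is a minimal sketch. It is not the planned wrapper and does not use any seamless_communication API. It assumes you have already collected a single cross-attention matrix `attn[token][frame]` (for example, averaged over heads and layers) and that each encoder frame covers a fixed duration. The names `monotonic_alignment`, `token_timestamps`, and `frame_seconds`, and the 0.02 s value, are hypothetical and chosen only for illustration. The alignment is a simple dynamic-programming search for the monotonic path with the highest total attention.

```python
from typing import List, Tuple


def monotonic_alignment(attn: List[List[float]]) -> List[Tuple[int, int]]:
    """Assign each output token a contiguous (first_frame, last_frame) span.

    attn[t][f] is the attention weight of output token t on encoder frame f.
    Every frame goes to exactly one token, and tokens keep their order.
    """
    n_tok, n_frm = len(attn), len(attn[0])
    if n_frm < n_tok:
        raise ValueError("need at least one frame per token")

    neg_inf = float("-inf")
    # score[t][f]: best total attention of a path that puts frame f on token t.
    score = [[neg_inf] * n_frm for _ in range(n_tok)]
    # advanced[t][f]: True if token t starts at frame f on that best path.
    advanced = [[False] * n_frm for _ in range(n_tok)]

    score[0][0] = attn[0][0]
    for f in range(1, n_frm):
        score[0][f] = score[0][f - 1] + attn[0][f]

    for t in range(1, n_tok):
        for f in range(t, n_frm):
            stay = score[t][f - 1]          # frame f-1 already on token t
            move = score[t - 1][f - 1]      # frame f-1 on the previous token
            if move >= stay:
                score[t][f] = move + attn[t][f]
                advanced[t][f] = True
            else:
                score[t][f] = stay + attn[t][f]

    # Backtrack from the last token on the last frame.
    spans = [[0, 0] for _ in range(n_tok)]
    t, f = n_tok - 1, n_frm - 1
    spans[t][1] = f
    while f > 0:
        if advanced[t][f]:
            spans[t][0] = f
            t -= 1
            spans[t][1] = f - 1
        f -= 1
    return [(first, last) for first, last in spans]


def token_timestamps(
    tokens: List[str], attn: List[List[float]], frame_seconds: float
) -> List[Tuple[str, float, float]]:
    """Convert frame spans to (token, start_seconds, end_seconds)."""
    spans = monotonic_alignment(attn)
    return [
        (tok, first * frame_seconds, (last + 1) * frame_seconds)
        for tok, (first, last) in zip(tokens, spans)
    ]


# Toy example: 2 output tokens, 5 encoder frames (made-up weights).
toy_attn = [
    [0.6, 0.3, 0.1, 0.0, 0.0],
    [0.0, 0.1, 0.2, 0.4, 0.3],
]
for token, start, end in token_timestamps(["hello", "world"], toy_attn, 0.02):
    print(f"{token}: {start:.2f}s to {end:.2f}s")
```

How you record the weights depends on the model code. In PyTorch, one common option is a forward hook on the decoder's cross-attention modules, provided those modules actually return their attention weights. The real frame duration also depends on the feature extractor and any encoder subsampling, so check it for the model you use.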

mavlyutovr avatar Sep 26 '23 03:09 mavlyutovr

Is this for speech-to-text? Would we get word-level timestamps?

If so, could you point me in the right direction on where to add the code wrapper that records the encoder-decoder attention weights?

Thank you

kartik1225 avatar Sep 28 '23 21:09 kartik1225

@mavlyutovr any updates on this?

kurianbenoy avatar Feb 07 '24 11:02 kurianbenoy

> You can do it today by adding a code wrapper that records the encoder-decoder attention weights, then using those weights to infer the alignment between the input audio frames and the output text tokens. We plan to add this wrapper for the speech recognition (ASR) task in the near future (November-December).

@mavlyutovr Looking forward to this ASR wrapper, too. Thanks to you and your team for this great project!

fumin avatar Feb 20 '24 06:02 fumin