seamless_communication
Can Seamless perform timestamping?
You can do it today by adding a code wrapper that records the encoder-decoder attention weights, then using those weights to infer the alignment between the input audio frames and the output text tokens. We plan to add this wrapper for the speech recognition (ASR) task in the near future (November-December).
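A minimal sketch of the alignment step only, assuming the attention weights have already been recorded as a `tokens x frames` matrix (for example, averaged over heads and layers). The function names and the `seconds_per_frame` parameter are hypothetical and not part of seamless_communication; the real frame duration depends on the model's feature extraction and encoder downsampling, so it has to be measured for your setup. The sketch picks one frame per token so that frame indices never go backwards and the total attention mass is maximal.

```python
from typing import List, Sequence, Tuple


def monotonic_alignment(attn: Sequence[Sequence[float]]) -> List[int]:
    """Pick one encoder frame per output token (hypothetical helper).

    attn[t][f] is the attention weight of output token t on input frame f.
    Frame indices are forced to be non-decreasing across tokens, and the
    chosen path maximizes the summed attention (dynamic programming).
    """
    n_tokens, n_frames = len(attn), len(attn[0])
    # score[t][f]: best total attention for tokens 0..t with token t on frame f
    score = [list(attn[0])]
    back = [[0] * n_frames]
    for t in range(1, n_tokens):
        row, ptr = [], []
        best_val, best_idx = float("-inf"), 0
        for f in range(n_frames):
            # Running maximum over previous-token frames f' <= f
            if score[t - 1][f] > best_val:
                best_val, best_idx = score[t - 1][f], f
            row.append(attn[t][f] + best_val)
            ptr.append(best_idx)
        score.append(row)
        back.append(ptr)
    # Backtrack from the best final frame
    f = max(range(n_frames), key=lambda i: score[-1][i])
    frames = [0] * n_tokens
    for t in range(n_tokens - 1, -1, -1):
        frames[t] = f
        f = back[t][f]
    return frames


def token_timestamps(
    tokens: Sequence[str],
    attn: Sequence[Sequence[float]],
    seconds_per_frame: float,  # assumption: must be measured for your model
) -> List[Tuple[str, float]]:
    """Convert aligned frame indices to approximate start times in seconds."""
    frames = monotonic_alignment(attn)
    return [(tok, f * seconds_per_frame) for tok, f in zip(tokens, frames)]
```

Word-level timestamps would then come from merging the subword tokens that belong to the same word and taking the time of the first one. Attention-based alignment is approximate, so treat the result as a rough estimate rather than a forced alignment.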
Is this for speech-to-text? Will we get word-level timestamps?
If so, could you point me in the right direction on where to add the code wrapper that would record the encoder-decoder attention weights?
Thank you
@mavlyutovr any updates on this?
@mavlyutovr Looking forward to this ASR wrapper, too. Thanks to you and your team for this great project!