Words timestamps [HELP]

Open RaulKite opened this issue 1 year ago • 10 comments

I'm not able to get a transcription with word-level timestamps, only sentence-level timestamps.

Is this possible with whisper-jax?

Thanks

RaulKite avatar Apr 24 '23 12:04 RaulKite

Hey @RaulKite! Not yet. Since this is a fairly new Whisper feature, we first need to add it to Hugging Face Transformers, and then propagate the changes on to Whisper JAX. Hoping to have this as soon as possible.

sanchit-gandhi avatar Apr 24 '23 15:04 sanchit-gandhi

First off, kudos on this achievement. We use both Google Speech and AWS, but this is just stellar performance!

  1. Quick question on the above: is there a timeline for word timestamps? Word timestamps are critical for playing video alongside transcripts, and all ASR systems provide them. We are considering using this in production to replace AWS, but without word timestamps that might not be possible. If you can provide a timeline, we would appreciate it a lot.
  2. Is there a way to detect filler words such as "uh" and "um" in the speech? It would be highly beneficial for educational purposes.

anindyagupta avatar Apr 24 '23 19:04 anindyagupta

Hey @anindyagupta - if anyone in the community would like to take a stab at adding word-level timestamps to 🤗 Transformers I'd be happy to guide the integration process and review PRs! Otherwise, I'm hoping to see to it by maybe next week. The full integration might take ~1.5-2 weeks?

Detecting filler words such as "uh" and "um" would be possible through prompting (an ongoing PR in 🤗 Transformers: https://github.com/huggingface/transformers/pull/22496). Again, once this is merged I'll propagate it on to Whisper JAX ASAP!

sanchit-gandhi avatar Apr 26 '23 11:04 sanchit-gandhi

Yes, I am interested. In the meantime, this works well: https://github.com/linto-ai/whisper-timestamped

rairavi avatar Apr 26 '23 18:04 rairavi

I currently use ^ whisper-timestamped and am looking to migrate to Whisper-JAX because of its fantastic speed. Really appreciate that you've got this on the docket for this repo so quickly. Looking forward to seeing this get integrated.

ferdavid1 avatar May 01 '23 19:05 ferdavid1

Hi, has there been any update on this? It appears the above PR in 🤗 Transformers has been merged. Sub-second timestamps would be really useful in my own application :)

vvvm23 avatar May 23 '23 19:05 vvvm23

@sanchit-gandhi is there any estimate for integrating both the initial prompt and word timestamps?

AvivSham avatar May 24 '23 07:05 AvivSham

Amazing project with super fast transcription, but it is still missing this very important feature: word-by-word timestamps.

gkarmas avatar Dec 03 '23 01:12 gkarmas

Is there any news about word-level timestamps?

crummenauerca avatar Jun 19 '24 14:06 crummenauerca

Yes, I would also love it if this were integrated :)

iampickle avatar Jun 23 '24 10:06 iampickle