Akash Mahajan comments

Results: 7 comments by Akash Mahajan

tdrz and coreml support?

Hey @akeybl, thanks for the cc! I was on break for a bit, hence the delay. Looks like I missed the coreml conversion in this PR: https://github.com/ggerganov/whisper.cpp/pull/1001 I'll take a...

tdrz and coreml support?

1). Regarding a finetuned `tiny.en-tdrz`, I'd tried it but it didn't actually work very well. Likely because it is a very weirdly shaped model (token embeddings are [>50% of...

whisper : mark speakers/voices (diarization)

Hi @ggerganov (and other maintainers of this awesome project!) - you might be interested in an early prototype that covers @SpusellaLo's use case over at https://github.com/akashmjn/tinydiarize This was designed keeping...

whisper : mark speakers/voices (diarization)

Exciting to hear back so soon! 🥳 I'm going to be travelling for the next couple of days, so I will take a closer look after I'm back on Monday and hit you...

whisper : mark speakers/voices (diarization)

Thanks for the effort @pratikmohanty. The `small.en-tdrz` checkpoint has the same structure, so it should convert and decode as normal. However, to surface the speaker turn tokens, edits are required to inference...

whisper : mark speakers/voices (diarization)

For anyone keen to give it a spin, I have an early hack over at https://github.com/akashmjn/whisper.cpp/tree/tdrz-hack-1

```
make
./models/download-ggml-model.sh small.en-tdrz
make samples
./main -m models/ggml-small.en-tdrz.bin -f samples/a13.wav
```

After running...

whisper : mark speakers/voices (diarization)

> > @ggerganov When running whisper.cpp, I get the speaker information only on the stdout result (I think it is VTT format), but the output JSON file does not include...