abelbabel
yeah, also saw this: https://github.com/openai/whisper/discussions/264 It seems they do it in two runs: one for the spoken text, one for the speakers, and then merge the results.
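For what it's worth, a rough sketch of how that two-run merge could look in Python, assuming the `openai-whisper` and `pyannote.audio` packages are available; the model names, the `gb0.wav` input and the overlap heuristic are just my own illustrative choices, not what the linked discussion actually uses:

```python
# Sketch: one run for text (whisper), one for speakers (pyannote), merged by timestamps.
# Assumes openai-whisper and pyannote.audio are installed; model names are illustrative.
import whisper
from pyannote.audio import Pipeline

audio_path = "gb0.wav"  # hypothetical input file

# Run 1: transcription (segments with start/end/text)
model = whisper.load_model("small")
result = model.transcribe(audio_path)

# Run 2: speaker diarization (speaker turns with start/end/label);
# may require a Hugging Face auth token for the pretrained pipeline
diarization = Pipeline.from_pretrained("pyannote/speaker-diarization")(audio_path)
turns = [(t.start, t.end, spk) for t, _, spk in diarization.itertracks(yield_label=True)]

# Merge: assign each transcript segment the speaker whose turn overlaps it the most
def best_speaker(seg):
    overlaps = [(min(seg["end"], e) - max(seg["start"], s), spk) for s, e, spk in turns]
    overlaps = [o for o in overlaps if o[0] > 0]
    return max(overlaps)[1] if overlaps else "unknown"

for seg in result["segments"]:
    print(f"[{seg['start']:7.2f} - {seg['end']:7.2f}] {best_speaker(seg)}: {seg['text'].strip()}")
```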
`-bo 7`, `-bo 10`, `-bo 15` and changing from `-O3` to `-O2` did not do the trick for me
> Is it possible to have the convert script support Hugging Face format like the one here https://huggingface.co/openai/whisper-medium/tree/main ? The use case is to run fine-tuned models with cpp. I don't...
> Personally, I'd be more than happy for whisper to just do speaker detection based on left & right channels on a stereo audio file. But I can achieve this...
> I've done some limited testing and was able to achieve a reasonable split via `pyannote`. Bolting it all together is a different story though. @savchenko Could you give a small...
Sorry, this does not work for me. For example, when piping `gb0.wav` (with the small model) I get ``` system_info: n_threads = 4 / 8 | AVX = 1 | AVX2...
Does this work with continuous data from a pipe for you too? On my end it seems to "wait" forever ... For example: `ffmpeg -loglevel -8 -i 'https://a.files.bbci.co.uk/media/live/manifesto/audio/simulcast/dash/nonuk/dash_low/cfs/bbc_world_service.mpd' -map_channel 0.0.0 -f...
Hi, I still want to emphasize the utility of a more general approach via pipe. Think of an inference machine (with proper hardware) that should be used remotely by other processes...
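To make the idea concrete, a minimal sketch of such a pipe consumer, assuming the upstream process writes 16 kHz mono s16le PCM to stdout and a small Python wrapper around `openai-whisper` reads it from stdin in fixed-size chunks (chunk length, model choice and the whole Python detour are illustrative assumptions, not a proposal for how whisper.cpp itself should do it):

```python
# Sketch: consume a continuous raw PCM stream from stdin and transcribe it chunk by chunk.
# Assumes the upstream process (e.g. ffmpeg) writes 16 kHz mono 16-bit little-endian PCM;
# chunk size and model choice are illustrative only.
import sys
import numpy as np
import whisper

SAMPLE_RATE = 16000
CHUNK_SECONDS = 30  # arbitrary window size
CHUNK_BYTES = SAMPLE_RATE * CHUNK_SECONDS * 2  # 2 bytes per s16le sample

model = whisper.load_model("small")

while True:
    raw = sys.stdin.buffer.read(CHUNK_BYTES)
    if not raw:
        break  # upstream closed the pipe
    # Convert s16le -> float32 in [-1, 1], which transcribe() accepts as a numpy array
    audio = np.frombuffer(raw, dtype=np.int16).astype(np.float32) / 32768.0
    result = model.transcribe(audio, fp16=False)
    print(result["text"].strip(), flush=True)
```

Something along the lines of `ffmpeg -i <stream> -f s16le -ar 16000 -ac 1 - | python stream_transcribe.py` (the script name is hypothetical) would feed it.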
In the end it may turn out to be what I was looking for, but I would consider it a workaround ... to me it seems to break how one expects unix programs...