Jong Wook Kim comments

Results: 86 comments by Jong Wook Kim

Add github action to automatically push to pypi on Release x.y.z commit

Hi, thanks for the PR. I wasn't ready to go fully enterprise open-source with semantic versioning and everything, but date-based versioning seems manageable. The package is now available as [`openai-whisper`...

Integrate Pytorch's TorchDynamo as a passed in callable [1.375x faster on A100] [Needs more benchmarking]

Thanks, I'll close this for now, since it doesn't quite yet work "out of the box" and relying on nightly versions makes things difficult for me to maintain. I'm hoping...

[Do not land] [RFC] 1.375x speedup - Remove control flow from model, small hacks, enable TorchDynamo + TorchInductor

Thanks, I'll close this for now, since it doesn't quite yet work "out of the box" and relying on nightly versions makes things difficult for me to maintain. I'm hoping...

Added --output option

Thanks for the PR! I've renamed the option to `--output_format` and moved some code around.
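For readers unfamiliar with how such a flag is typically wired up, here is a minimal sketch of an `--output_format` option using `argparse`. Only the flag name comes from the comment above; the `build_parser` helper, the positional `audio` argument, the default, and the list of allowed formats are all assumptions made for illustration and may not match the actual CLI.

```python
import argparse


def build_parser():
    # Hypothetical parser: only the --output_format flag name is from the comment.
    parser = argparse.ArgumentParser(description="Illustrative transcription CLI")
    parser.add_argument("audio", nargs="+", help="audio file(s) to transcribe")
    parser.add_argument(
        "--output_format",
        default="txt",                  # assumed default
        choices=["txt", "vtt", "srt"],  # assumed set of formats
        help="format of the output file",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.output_format)
```

Restricting the value with `choices` makes `argparse` reject unknown formats at parse time instead of failing later when the file is written.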

Uses MPS (Mac acceleration) by default when available

I also see the same errors as others mentioned above, on an M1 Mac running arm64 Python.

Disabled '...' from being generated, since it often gets generated

Thanks for the suggestion; there might be cases where `...` could be useful, e.g. when the speaker is hesitating like "I was... I did something". So I'd keep this enabled...

[README] Add section on 🤗 Transformers

Hi Sanchit and all, as discussed offline, let me close this in favor of a separate post on the Discussions page.

Fix catastrophic timestamp drifting from negative duration via clamping

Thanks for the PR! What I wanted to do instead of this is to mask the timestamp tokens during sampling so that those are conditioned to be monotonically increasing, combined...
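The idea of masking timestamp tokens so they can only move forward can be sketched roughly as follows. This is not the code from the PR or from the library; it assumes that timestamp tokens occupy a contiguous range of ids starting at a hypothetical `timestamp_begin`, that larger ids mean later times, and that repeating the previous timestamp is allowed. The function name and parameters are illustrative.

```python
import math


def mask_non_monotonic_timestamps(logits, timestamp_begin, last_timestamp_token=None):
    """Return a copy of `logits` where timestamp tokens earlier than the
    last emitted timestamp are set to -inf, so sampling cannot pick them.

    Assumes ids >= timestamp_begin are timestamp tokens in increasing time order.
    """
    masked = list(logits)
    if last_timestamp_token is None:
        # No timestamp emitted yet, so nothing needs to be masked.
        return masked
    for token_id in range(timestamp_begin, last_timestamp_token):
        masked[token_id] = -math.inf
    return masked


if __name__ == "__main__":
    # Ids 0-1 are text tokens, ids 2-5 are timestamp tokens (assumed layout).
    logits = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
    print(mask_non_monotonic_timestamps(logits, timestamp_begin=2, last_timestamp_token=4))
```

Because the mask is applied to the logits before sampling, a negative segment duration can never be produced, which differs from clamping the timestamps after the fact.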

word-level timestamps in `transcribe()`

Thanks for the comments, all -- this is work in progress and not quite ready for merging. I'm trying to address both hallucination and performance concerns.

word-level timestamps in `transcribe()`

Hi @IgnacioSan22, the custom DTW implementation in this PR was for the license issue as noted by others and also for the speed. An alternative is to use the timestamp...
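For context, dynamic time warping (DTW) finds the cheapest monotonic alignment between two sequences given a pairwise cost matrix. Below is a minimal pure-Python sketch of the textbook algorithm; it is not the implementation from the PR, and the `dtw_path` name and the list-of-lists cost format are assumptions chosen for illustration.

```python
import math


def dtw_path(cost):
    """Classic DTW over a cost matrix (list of rows).

    Returns (total_cost, path), where path is a list of (row, col) index
    pairs from (0, 0) to the bottom-right cell.
    """
    n, m = len(cost), len(cost[0])
    # acc[i][j] holds the cheapest cost of aligning the first i rows with the first j columns.
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i][j] = cost[i - 1][j - 1] + min(
                acc[i - 1][j - 1],  # advance both sequences
                acc[i - 1][j],      # advance rows only
                acc[i][j - 1],      # advance columns only
            )

    # Backtrack from the end, always stepping to the cheapest predecessor.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates, key=lambda c: c[0])
    path.reverse()
    return acc[n][m], path


if __name__ == "__main__":
    total, path = dtw_path([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
    print(total, path)
```

This version runs in O(n·m) time with Python loops, so it is meant to show the recurrence rather than to be fast; a production implementation would typically vectorize or compile the inner loop.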