whisper-jax
How to save txt, vtt and srt outputs? How to set beam_size, initial_prompt, best_of and other parameters?
I have checked the main page and Kaggle, and there is no example of these.
With regular Whisper I was doing it like below.
How can I do the same with Whisper JAX?
import whisper
from whisper.utils import write_srt  # present in older openai-whisper releases

result = model.transcribe(
    f"../input/whisper2/lecture_{lectureId}.mp3",
    language="en",
    beam_size=10,
    best_of=10,
    initial_prompt="Welcome to the Software Engineering Courses channel.",
    temperature=0.0,
    verbose=True,
)
# save SRT
language = result["language"]
sub_name = f"/kaggle/working/lecture_{lectureId}.srt"
with open(sub_name, "w", encoding="utf-8") as srt:
    write_srt(result["segments"], file=srt)
# Save output
writing_lut = {
    '.txt': whisper.utils.write_txt,
    '.vtt': whisper.utils.write_vtt,
    '.srt': whisper.utils.write_srt,  # was write_txt, which would write plain text into the .srt file
}
Hey @FurkanGozukara,
To save the transcriptions as a .txt file, you can do the following:
pred_str = pipeline(...)["text"]  # the pipeline returns a dict; the transcription is under "text"
with open("output.txt", "w") as text_file:
    text_file.write(pred_str)
This will save your predictions to a file called output.txt.
We could definitely extend this to saving .vtt/.srt files. To do so, we'd need to port whisper/utils.py to Whisper JAX.
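Until such a port exists, here is a minimal sketch of how SRT text could be built by hand. It is not the whisper/utils.py implementation. It assumes you have a list of chunks shaped like {"timestamp": (start, end), "text": "..."}, which is what the Whisper JAX README describes for pipeline(..., return_timestamps=True)["chunks"]; the helper names format_timestamp and chunks_to_srt are made up for this example.

```python
def format_timestamp(seconds, decimal_marker=","):
    """Format a time in seconds as HH:MM:SS,mmm (the SRT timestamp layout)."""
    milliseconds = round(seconds * 1000)
    hours, milliseconds = divmod(milliseconds, 3_600_000)
    minutes, milliseconds = divmod(milliseconds, 60_000)
    secs, milliseconds = divmod(milliseconds, 1_000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_marker}{milliseconds:03d}"


def chunks_to_srt(chunks):
    """Build SRT text from chunks shaped like {"timestamp": (start, end), "text": str}.

    The chunk layout is an assumption about the pipeline output, not a confirmed API.
    """
    blocks = []
    for index, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]
        if end is None:  # assumption: a final chunk may come without an end time
            end = start
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{chunk['text'].strip()}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


# Hand-written sample data standing in for real transcription chunks.
example_chunks = [
    {"timestamp": (0.0, 2.5), "text": " Hello"},
    {"timestamp": (2.5, 4.0), "text": " world"},
]
srt_text = chunks_to_srt(example_chunks)
```

To save the result, write srt_text to a file opened with encoding="utf-8", the same way as the .txt example above.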
Hi, @sanchit-gandhi. Thanks for the amazing work you've done.
Regarding the decoding options such as beam_size, initial_prompt, etc., it seems that it is not possible to change them. Can you confirm? Are there any plans to allow tuning those parameters in the future?
Please implement SRT export, I would be very thankful.
Guys, please provide the code to export in the SRT format. I will be grateful.
Is there any update on the SRT export, please?
Yeah, I didn't spend any time on it because it lacks this most important feature:
SRT and VTT output like the original Whisper, which I use daily.
Any updates on enabling temperature setting?
Hi. When will you port "utils.py" so that we can generate .srt files? I really need it, but I'm not good at programming.
Agreed. Keen on being able to drop back into whisper to write output to SRT, VTT, etc.
Hi!
I just saw this Kaggle notebook (I didn't test it); maybe you can take a look and take the SRT implementation from it?
https://www.kaggle.com/code/idealim/mp3-to-text-and-subtitles-with-whisper-jax/notebook
@sanchit-gandhi
@rRobis thanks a lot for sharing the code, the save_as_srt function works like a charm.
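For anyone who also needs .vtt, here is a rough sketch of a WebVTT writer along the same lines. This is not the notebook's save_as_srt function and not part of Whisper JAX. It assumes chunks shaped like {"timestamp": (start, end), "text": "..."}, and the names vtt_timestamp and chunks_to_vtt are invented for illustration. The differences from SRT are the WEBVTT header line, a dot instead of a comma before the milliseconds, and no cue numbers.

```python
def vtt_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS.mmm (the WebVTT timestamp layout)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1_000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def chunks_to_vtt(chunks):
    """Build WebVTT text from chunks shaped like {"timestamp": (start, end), "text": str}.

    The chunk layout is an assumption, not a confirmed Whisper JAX API.
    """
    lines = ["WEBVTT", ""]  # required header followed by a blank line
    for chunk in chunks:
        start, end = chunk["timestamp"]
        if end is None:  # assumption: a final chunk may come without an end time
            end = start
        lines.append(f"{vtt_timestamp(start)} --> {vtt_timestamp(end)}")
        lines.append(chunk["text"].strip())
        lines.append("")  # blank line ends the cue
    return "\n".join(lines)


# Hand-written sample data standing in for real transcription chunks.
vtt_text = chunks_to_vtt([{"timestamp": (0.0, 1.5), "text": " Hi"}])
```

As with the text output, vtt_text can be written to a file opened with encoding="utf-8".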