whisper-jax

How to save txt, vtt and srt outputs? How to set beam_size, initial_prompt, best_of and other parameters?

Open FurkanGozukara opened this issue 2 years ago • 12 comments

I have checked the main page and the Kaggle notebook, and neither has an example of this.

With regular Whisper I was doing it as shown below.

How can I do the same with Whisper JAX?

        result = model.transcribe("../input/whisper2/lecture_"+str(lectureId)+".mp3",language="en",beam_size=10,initial_prompt="Welcome to the Software Engineering Courses channel.",best_of=10,verbose=True,temperature=0.0)

        # save SRT

        language = result["language"]
        sub_name = f"/kaggle/working/lecture_{lectureId}.srt"
        with open(sub_name, "w", encoding="utf-8") as srt:
            write_srt(result["segments"], file=srt)

        # Save output
        writing_lut = {
            '.txt': whisper.utils.write_txt,
            '.vtt': whisper.utils.write_vtt,
            '.srt': whisper.utils.write_srt,
        }

FurkanGozukara avatar Apr 23 '23 18:04 FurkanGozukara

Hey @FurkanGozukara,

To save the transcriptions as a .txt file, you can do the following:

pred_str = pipeline(...)["text"]  # the pipeline returns a dict; the transcription is under "text"

with open("output.txt", "w") as text_file:
    text_file.write(pred_str)

This will save your predictions to a file called output.txt.

We could definitely extend this to saving .vtt/.srt files. To do so, we'd need to port whisper/utils.py to Whisper JAX.

sanchit-gandhi avatar Apr 24 '23 14:04 sanchit-gandhi
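
Until such a port exists, the .srt case can be handled by hand. The following is a minimal sketch, not part of Whisper JAX: it assumes the pipeline was called with return_timestamps=True and that the result has a "chunks" list whose items look like {"timestamp": (start, end), "text": "..."}. The helper names format_srt_timestamp and chunks_to_srt are hypothetical, and the sample chunks are hand-written stand-ins for real pipeline output.

```python
def format_srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def chunks_to_srt(chunks) -> str:
    """Build SRT text from dicts shaped like {"timestamp": (start, end), "text": str}."""
    blocks = []
    for index, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]
        if end is None:
            # Assumption: a trailing chunk may lack an end time; fall back to its start.
            end = start
        blocks.append(
            f"{index}\n"
            f"{format_srt_timestamp(start)} --> {format_srt_timestamp(end)}\n"
            f"{chunk['text'].strip()}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


# Hand-written sample standing in for outputs["chunks"] (assumed shape).
sample_chunks = [
    {"timestamp": (0.0, 2.5), "text": " Welcome to the course."},
    {"timestamp": (2.5, 3661.007), "text": " Let's begin."},
]
srt_text = chunks_to_srt(sample_chunks)

with open("output.srt", "w", encoding="utf-8") as srt_file:
    srt_file.write(srt_text)
```

With real output, sample_chunks would be replaced by the "chunks" entry of the pipeline result, provided it has the shape assumed above.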

Hi, @sanchit-gandhi. Thanks for the amazing work you've done.

Regarding the decoding options such as beam_size, initial_prompt, etc., it seems that it is not possible to change them. Can you confirm? Are there any plans to allow tuning those parameters in the future?

ruimaia avatar Apr 27 '23 10:04 ruimaia

Please implement SRT export, I would be very thankful.

rRobis avatar May 02 '23 16:05 rRobis

Guys, please provide the code to export in the SRT format. I will be grateful.

tropanets avatar May 04 '23 07:05 tropanets

Is there any update on the SRT export, please?

nhan000 avatar May 28 '23 22:05 nhan000

Yeah, I didn't spend any time on it because it lacks this most important feature:

SRT and VTT output like the original Whisper, which I use daily.

FurkanGozukara avatar May 28 '23 22:05 FurkanGozukara

Any updates on enabling temperature setting?

troublesprouter avatar Jun 14 '23 00:06 troublesprouter

Hi. When will you port "utils.py" so that we can generate .srt files? I really need it, but I'm not good at programming.

CyberAndrew avatar Jun 21 '23 20:06 CyberAndrew

Agreed. Keen on being able to drop back into Whisper to write output to SRT, VTT, etc.

sovdevs avatar Jul 16 '23 12:07 sovdevs

Hi!

I just saw this Kaggle notebook (I didn't test it). Maybe you can take a look and take the SRT implementation from it?

https://www.kaggle.com/code/idealim/mp3-to-text-and-subtitles-with-whisper-jax/notebook

rRobis avatar Jul 16 '23 12:07 rRobis

@sanchit-gandhi

rRobis avatar Jul 22 '23 17:07 rRobis

@rRobis thanks a lot for sharing the code, the save_as_srt function works like a charm.

Philipp-Sc avatar Aug 21 '23 12:08 Philipp-Sc
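
For the .vtt half of the original question, a similar approach should work. This is a rough sketch under stated assumptions, not code from the linked notebook or from Whisper JAX: it assumes timestamped chunks shaped like {"timestamp": (start, end), "text": "..."}, and the helper names format_vtt_timestamp and chunks_to_vtt are hypothetical. Compared with SRT, WebVTT starts with a WEBVTT header line, uses a dot before the milliseconds, and does not need numbered cues.

```python
def format_vtt_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def chunks_to_vtt(chunks) -> str:
    """Build WebVTT text from dicts shaped like {"timestamp": (start, end), "text": str}."""
    lines = ["WEBVTT", ""]  # mandatory header followed by a blank line
    for chunk in chunks:
        start, end = chunk["timestamp"]
        if end is None:
            # Assumption: a trailing chunk may lack an end time; fall back to its start.
            end = start
        lines.append(f"{format_vtt_timestamp(start)} --> {format_vtt_timestamp(end)}")
        lines.append(chunk["text"].strip())
        lines.append("")  # blank line terminates the cue
    return "\n".join(lines)


# Hand-written sample standing in for timestamped pipeline output (assumed shape).
sample_chunks = [
    {"timestamp": (0.0, 2.5), "text": " Welcome to the course."},
    {"timestamp": (2.5, 5.0), "text": " Let's begin."},
]
vtt_text = chunks_to_vtt(sample_chunks)

with open("output.vtt", "w", encoding="utf-8") as vtt_file:
    vtt_file.write(vtt_text)
```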