
RuntimeError: Calculated padded input size per channel: (6). Kernel size: (7). Kernel size can't be greater than actual input size

Open lukaszliniewicz opened this issue 1 year ago • 2 comments

I'm getting this error regardless of the wav file I use, including the demo file:

RuntimeError: Calculated padded input size per channel: (6). Kernel size: (7). Kernel size can't be greater than actual input size

Have you encountered this before?

lukaszliniewicz, Apr 01 '24 23:04

Is it happening at

    concat_sample = audio_tokenizer.decode(
        [(concat_frames, None)] # [1,T,8] -> [1,8,T]
    )
    gen_sample = audio_tokenizer.decode(
        [(gen_frames, None)]
    )

That means the model generated too few tokens. This shouldn't happen with a well-trained model unless the prompt is extremely hard, such as inaudible speech or extreme noise. Plus, I already have code that prevents this from happening (search for if cur_num_gen <= self.args.encodec_sr // 5 in ./models/voicecraft.py).

To verify whether the model output only a few tokens, print gen_frames. Its shape should be [1, 4, T], where T should be about n sec * 50; i.e. if you expect the output to be longer than 2 sec, T should be bigger than 100.
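The check described above could be sketched roughly as follows. This is only an illustration: the helper names expected_frames and check_gen_frames are hypothetical and not part of VoiceCraft, and the 50 frames-per-second rate is taken from the comment above. The function works on a plain shape tuple, so with a real tensor you would pass tuple(gen_frames.shape).

```python
# Hypothetical helpers (not part of VoiceCraft) for sanity-checking the
# length of the generated frames before they are passed to the decoder.

CODEC_FRAME_RATE = 50  # frames per second, as stated in the comment above


def expected_frames(seconds, frame_rate=CODEC_FRAME_RATE):
    """Number of codec frames corresponding to `seconds` of audio."""
    return int(seconds * frame_rate)


def check_gen_frames(shape, min_seconds, frame_rate=CODEC_FRAME_RATE):
    """Return (long_enough, duration_in_seconds) for a [1, K, T] shape."""
    if len(shape) != 3:
        raise ValueError(f"expected a 3-D shape [1, K, T], got {shape}")
    _, _, num_frames = shape
    duration = num_frames / frame_rate
    return num_frames >= expected_frames(min_seconds, frame_rate), duration


# Example: only 6 frames were generated although 2 seconds were expected.
ok, duration = check_gen_frames((1, 4, 6), min_seconds=2)
print(ok, duration)
```

If the first value is False, the model stopped generating very early, which matches the situation described above where the decoder receives too few frames.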

jasonppy, Apr 02 '24 00:04

Apologies, it was my own stupidity. I didn't realize that you need to include the transcript of the sample in the prompt! Some generations worked (if the wav sample used was very short, <1s, which got me really confused), and finally I noticed that in the notebook example you prepend the transcript to the prompt...
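For anyone hitting the same problem, the fix amounts to something like the sketch below. The function and variable names are hypothetical and are not taken from the VoiceCraft notebook; the only point is that the text passed to the model starts with the transcript of the reference wav, followed by the new text to synthesize.

```python
# Hypothetical helper (not from the VoiceCraft code base): build the full
# text prompt by prepending the transcript of the reference audio.

def build_target_transcript(prompt_transcript, new_text):
    """Join the reference transcript and the new text with a single space."""
    prompt_transcript = prompt_transcript.strip()
    new_text = new_text.strip()
    if not prompt_transcript:
        # Without the reference transcript the text and audio do not line up,
        # which is the mistake described in the comment above.
        raise ValueError("the transcript of the reference wav is required")
    return f"{prompt_transcript} {new_text}"


# Placeholder strings, for illustration only.
full_prompt = build_target_transcript(
    "This is what is said in the reference clip.",
    "And this is the new sentence to generate.",
)
print(full_prompt)
```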

Anyhow, I got this working natively on Windows with some minor adjustments to path handling in several audiocraft files. I will do some more testing, but eventually I would like to include VoiceCraft as an option in my audiobook generator app (https://github.com/lukaszliniewicz/Pandrator), which is of course non-commercial and open source, as I received a request for it.

lukaszliniewicz, Apr 02 '24 16:04