VoiceCraft
gradio port
I did not like having to mess with Jupyter and run Whisper separately, so I made a Gradio version. I will submit a pull request eventually; you can try it out here for now. Note that the conda env is slightly different in my fork: https://github.com/friendlyFriend4000/VoiceCraft
Thanks! Looking forward to it!
Can you make a Colab Notebook for this?
@friendlyFriend4000 thanks for doing this! I am actually having problems using it (Ubuntu 22.04). I get errors like:
WARNING:phonemizer:words count mismatch on 100.0% of the lines (1/1)
RuntimeError: Calculated padded input size per channel: (6). Kernel size: (7). Kernel size can't be greater than actual input size
I am happy to collaborate with you to test and sort out problems like this! Maybe we need to prepare some more instructions about installing and using it?
You can ignore the first message (it is only a warning). For the second one, I think your cut-off timing is slightly beyond the 'expected' transcript length. Try decreasing the cut-off timing by a couple of milliseconds.
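To make that suggestion concrete, here is a minimal sketch of one way to keep the cut-off inside the transcript. It is only an illustration of the idea, not code from the fork: `clamp_cut_off`, `word_ends_sec`, and `margin_sec` are hypothetical names, and it assumes you have the end time (in seconds) of each transcribed word.

```python
def clamp_cut_off(cut_off_sec, word_ends_sec, margin_sec=0.002):
    """Return a cut-off time that stays slightly before the end of the transcript.

    Hypothetical helper (not part of VoiceCraft): word_ends_sec is assumed to be
    a list of end times, in seconds, for the transcribed words.
    """
    if not word_ends_sec:
        raise ValueError("need at least one word end time")
    # Latest allowed cut-off: a couple of milliseconds before the last word ends.
    latest_allowed = max(word_ends_sec) - margin_sec
    # Never go below zero, and never go past the latest allowed time.
    return max(0.0, min(cut_off_sec, latest_allowed))


# A cut-off past the last word is pulled back; a valid one is left unchanged.
print(clamp_cut_off(3.5, [0.4, 1.2, 3.2]))
print(clamp_cut_off(2.0, [0.4, 1.2, 3.2]))
```

If the error persists with a clamped value, the cause is probably something other than the cut-off timing.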
Sorry to bother you again, but the generated "Output Audio" is between 0 and 2 seconds long and is just scrambled words. Probably I am the problem and am not doing something I should do.
There is a completely better version of a Gradio implementation in a PR right now.