Daniel Holler
Was this issue solved? I have the same problem. 1. Took the pretrained YourTTS model: multilingual-multi-dataset-your_tts 2. Fine-tuned on a new speaker for 10k steps 3. Stopped training just to...
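For context, step 1 usually just means pulling the released checkpoint; here is a minimal sketch using Coqui's Python API to cache the model and run a quick zero-shot check before fine-tuning (the reference clip and output path are placeholders, and the fine-tuning itself goes through the repo's YourTTS training recipe, which isn't shown here):

```python
from TTS.api import TTS

# Download/cache the pretrained multilingual YourTTS checkpoint and
# run a quick zero-shot synthesis against a reference speaker clip.
tts = TTS(model_name="tts_models/multilingual/multi-dataset/your_tts", progress_bar=False)
tts.tts_to_file(
    text="Quick check of the pretrained model before fine-tuning.",
    speaker_wav="new_speaker_reference.wav",  # placeholder reference clip
    language="en",
    file_path="pretrained_check.wav",
)
```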
@delete-exe Did you find a solution to this? Same issue here.
> Did you guys ever find a solution? I fixed it by running it on a different machine: an AWS EC2 instance with an A10G card. Didn't work on my...
Hey @ggbangyes , did you manage to figure out if this was possible?
I would be happy to see an implementation of this as well! :-)
@Shiro836 @benjismith For getting word-level timestamps I had great success using Kalpy (https://github.com/mmcauliffe/kalpy), a low-level Python wrapper around Kaldi (the C++ speech-processing toolkit). The author of Kalpy is...
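I can't vouch for the exact Kalpy calls without checking its docs, but Kaldi-based aligners conventionally emit word timings in CTM format (`utterance channel start duration word`), which maps directly onto word-level timestamps; a minimal parser sketch (the `WordTiming` container and file path are my own illustration, not part of Kalpy):

```python
from dataclasses import dataclass

@dataclass
class WordTiming:
    word: str
    start: float  # seconds
    end: float    # seconds

def parse_ctm(path: str) -> dict[str, list[WordTiming]]:
    """Group CTM records (`utt chan start dur word [conf]`) into per-utterance word timings."""
    timings: dict[str, list[WordTiming]] = {}
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) < 5:
                continue
            utt, _chan, start, dur, word = fields[:5]
            timings.setdefault(utt, []).append(
                WordTiming(word=word, start=float(start), end=float(start) + float(dur))
            )
    return timings
```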
I'm going to try getting a very basic thing to run. Currently, there are a couple issues: - LoRA parameter sizes don't match properly with the token embeddings and output...
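To make the shape constraint concrete: with LoRA, the low-rank factors have to mirror the in/out dimensions of whatever layer they adapt, so a mismatch against the token embedding or output projection breaks the forward pass immediately. A minimal sketch in plain PyTorch (the `LoRALinear` class and the vocab/hidden sizes are illustrative, not from the repo):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen nn.Linear with a low-rank update: W x + (alpha / r) * B A x."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False
        # A: (r, in_features), B: (out_features, r) -- both must track the
        # wrapped layer's shapes, or the adapter can't be applied.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

# Example: an output head projecting hidden size 1024 to a 10048-token vocab.
head = LoRALinear(nn.Linear(1024, 10048, bias=False), r=8)
logits = head(torch.randn(2, 16, 1024))
print(logits.shape)  # torch.Size([2, 16, 10048])
```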
> > I'm going to try getting a very basic thing to run. Currently, there are a couple issues: > > LoRA parameter sizes don't match properly with the token...
> > Makes sense to just work with stage 1 with cross entropy loss, I just wasn't able to get there - so thanks for that insight. > > Not...
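"Stage 1 with cross entropy loss" presumably refers to training the first autoregressive stage with a standard next-token objective; a minimal sketch, assuming the model emits logits over the token vocabulary (the function name is illustrative):

```python
import torch
import torch.nn.functional as F

def next_token_loss(logits: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
    """Next-token cross-entropy.

    logits: (batch, seq, vocab) model outputs
    tokens: (batch, seq) ground-truth token ids
    """
    # Predict token t+1 from positions up to t, so shift by one.
    shift_logits = logits[:, :-1, :].contiguous()
    shift_labels = tokens[:, 1:].contiguous()
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
    )
```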
Just committed - here's an overview: - Switched from the Adam optimizer to SGD - Modified the DataLoader to return the first two Encodec token hierarchies as a flattened interleaved tensor (let me...
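A minimal sketch of what that flattened, interleaved layout could look like (the function name and the time-major ordering are assumptions, not taken from the actual commit):

```python
import torch

def interleave_first_two(codes: torch.Tensor) -> torch.Tensor:
    """Flatten the first two Encodec codebook levels into one interleaved sequence.

    codes: (num_codebooks, T) integer codes from Encodec's quantizer.
    Returns a 1-D tensor [c0_0, c1_0, c0_1, c1_1, ...] of length 2 * T.
    """
    first_two = codes[:2]            # (2, T)
    return first_two.T.reshape(-1)   # time-major interleave

# Dummy example: 8 codebooks, 5 frames.
codes = torch.arange(40).reshape(8, 5)
print(interleave_first_two(codes))  # tensor([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])
```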