Andrei Constantinescu comments

Results: 6 comments by Andrei Constantinescu

Training on a custom dataset

Is this used for transcribing? You can just use Whisper to transcribe most languages. Then convert the audio into an HDF5 file plus text transcripts and feed them in via the dynamic batch sampler.
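
As a rough sketch of that workflow (not the project's actual preprocessing script), something like the following could work. It assumes the `openai-whisper` and `h5py` packages are installed; the function names `transcript_path_for` and `build_dataset`, the one-dataset-per-clip HDF5 layout, and the sidecar `.txt` naming are all hypothetical and would need to match whatever the training code expects.

```python
from pathlib import Path


def transcript_path_for(audio_path: Path) -> Path:
    """Return the sidecar .txt path next to an audio file (hypothetical naming)."""
    return audio_path.with_suffix(".txt")


def build_dataset(audio_dir: str, h5_path: str, model_name: str = "base") -> int:
    """Transcribe every .wav in audio_dir with Whisper, store the waveforms in
    one HDF5 file, and write one transcript per clip. Returns the clip count."""
    # Imported lazily so the helper above stays usable without these packages.
    import h5py      # pip install h5py
    import whisper   # pip install openai-whisper

    model = whisper.load_model(model_name)
    count = 0
    with h5py.File(h5_path, "w") as h5:
        for wav in sorted(Path(audio_dir).glob("*.wav")):
            # transcribe() returns a dict; the full transcript is under "text".
            text = model.transcribe(str(wav))["text"].strip()
            transcript_path_for(wav).write_text(text, encoding="utf-8")
            # load_audio() returns a mono float32 waveform resampled to 16 kHz.
            # Assumed layout: one HDF5 dataset per clip, keyed by file stem.
            h5.create_dataset(wav.stem, data=whisper.load_audio(str(wav)))
            count += 1
    return count
```

Calling `build_dataset("data/wavs", "data/train.h5")` would then produce the HDF5 file and the transcripts in one pass; both paths here are placeholders.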

[DiT video]

Yes, I'd be interested in collaborating too. I've already set up a ViViT (Video Vision Transformer) architecture with this DiT as a reference. If you look at Sora they also...

Phonemizer as dependency

All I see is that you've removed the phonemizer dependency and now use an English-to-IPA library. But if I'd like to add languages, can I still use phonemizer?

Training speed

Is there a specific model loss/validation loss you employed as a benchmark for convergence?

ValueError: No valid path is found for training.

Hello. What do you mean by two different pairs of files? I have the .qnt.pt, phon.txt, normalized.txt, and wav files under my directory data/librosa; the config files are...

ValueError: No valid path is found for training.

> Your phoneme files need to have between 10 and 50 phonemes in them. Try using shorter audio clips; even 10-second clips can be too long. My training samples...
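
A quick way to check a dataset against the 10 to 50 phoneme limit from the quote is sketched below. It assumes phonemes are whitespace-separated inside each file, that both bounds are inclusive, and that the phoneme files match the glob `*phon.txt`; the helper names are hypothetical and not part of the project.

```python
from pathlib import Path

MIN_PHONEMES = 10  # lower bound quoted above (assumed inclusive)
MAX_PHONEMES = 50  # upper bound quoted above (assumed inclusive)


def count_phonemes(text: str) -> int:
    """Count phonemes, assuming they are separated by whitespace."""
    return len(text.split())


def is_valid_length(text: str, lo: int = MIN_PHONEMES, hi: int = MAX_PHONEMES) -> bool:
    """True if the phoneme count falls inside the allowed range."""
    return lo <= count_phonemes(text) <= hi


def find_out_of_range(data_dir: str, pattern: str = "*phon.txt") -> list[tuple[Path, int]]:
    """List (path, phoneme_count) for every phoneme file outside the range."""
    rejected = []
    for path in sorted(Path(data_dir).glob(pattern)):
        text = path.read_text(encoding="utf-8")
        if not is_valid_length(text):
            rejected.append((path, count_phonemes(text)))
    return rejected
```

Running `find_out_of_range("data/librosa")` lists the files that would likely be skipped, which helps tell "no valid path" caused by clip length apart from a wrong directory or file suffix.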