Rishikesh (ऋषिकेश)

Results 160 comments of Rishikesh (ऋषिकेश)

Yes, greedy is not enough, because this task is a many-to-many conversion, which is hard for a single-run greedy solution. My training loss is around 5 (ideally loss...

@bharani-y The teacher training set comes from a knowledge distillation method called teacher-student learning, where we train a large teacher model to learn the probability distribution of the complex data,...

> PS. I dived into some of the NAR (non-autoregressive) machine translation papers and the consensus was that training a NAR model (and they use even more "tricks" than SoundStorm)...

@bharani-y are you using the same `dataloader` and `variable random window` as in my repo? Also, which semantic tokenizer are you using, and how many clusters are in your semantic dataset?

Currently I am training the model on the large LibriTTS dataset from here: https://huggingface.co/datasets/collabora/whisperspeech/tree/main

> From my experiment I can confirm that very naive upsampling of the semantic tokens from HuBERT (50 Hz) to match EnCodec (75 Hz) works and can produce reasonable voices. My own...
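
For readers unfamiliar with the frame-rate mismatch mentioned in the quote, here is a minimal sketch of one naive way to stretch a 50 Hz token sequence to 75 Hz by repeating tokens (nearest-frame lookup). This is an assumption about what "naive upsampling" could look like, not the quoted author's code; the function name `upsample_tokens` and its parameters are hypothetical.

```python
def upsample_tokens(tokens, src_hz=50, dst_hz=75):
    """Stretch a discrete token sequence from src_hz to dst_hz by repetition.

    Hypothetical helper: each output frame copies the source frame that
    covers the same point in time (floor of the rescaled index).
    """
    if not tokens:
        return []
    # Number of output frames covering the same duration as the input.
    n_out = len(tokens) * dst_hz // src_hz
    last = len(tokens) - 1
    # Map output frame i back to a source frame index, clamped to the end.
    return [tokens[min(i * src_hz // dst_hz, last)] for i in range(n_out)]


semantic = [10, 20, 30, 40]          # 4 frames at 50 Hz
stretched = upsample_tokens(semantic)  # 6 frames at 75 Hz
print(stretched)
```

With the default 50 Hz to 75 Hz ratio, every second source token is repeated once, so the output length is 1.5 times the input length.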

> below I provide the core code and one sample, which I think is very close to the paper's description
> https://github.com/feng-yufei/shared_debugging_code/blob/main/soundstorm.py, hope it can be useful

@feng-yufei sample is...

> For an experiment on a larger dataset I tried LibriTTS 100/360/500 merged together, and the quality is strangely bad (50% top-10 training accuracy, while LJSpeech has 65%). I have also trained on...

@p0p4k the sample sounds good; I think with more training it will get a lot better. I think multilinguality is easy to implement in this repo. I think the problem occurs when you...

Currently facing this issue:
```
run.sh: line 37: utils/parse_options.sh: No such file or directory
Prepare LibriTTS dataset
split the data for 1 GPUs
cat: data/val/wav.scp: No such file or...
```