Rishikesh (ऋषिकेश)

Results: 32 issues by Rishikesh (ऋषिकेश)

Have you tried this in a multi-speaker setting?

Hi @rosinality, hope you are doing well! I really like your repo, especially the dataloader and augmentation parts for image classification. I am not primarily working in the vision field but...

Hi @cantabile-kwok, I am curious whether you have tried this model for the zero-shot voice conversion use case. The idea is very simple: ``` Source voice speech ->...

Hi @cantabile-kwok, I have also implemented UniCATS's vec2wav, but that model is too slow, so I am curious about the inference speed of this model. Actually, I am...

I have trained the `update_v2` branch on:
* Semantic tokens extracted from HuBERT Large layer 16 with 1024-cluster k-means (`50 tok/sec`)
* Acoustic tokens extracted from EnCodec 24 kHz sample...
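The `50 tok/sec` figure and the k-means step mentioned above can be illustrated with a minimal sketch. It assumes 16 kHz input audio and a total encoder stride of 320 samples (the usual HuBERT setup), and it uses tiny made-up 2-D frames and three centroids in place of real HuBERT features and a 1024-entry codebook. The function names `tokens_per_second` and `quantize` are hypothetical and are not taken from the repository under discussion.

```python
# Illustrative sketch only: all names and toy values here are assumptions,
# not code from the repository discussed in the issue.

def tokens_per_second(sample_rate_hz, hop_samples):
    """Number of feature frames (and hence tokens) produced per second of audio."""
    return sample_rate_hz / hop_samples


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(frames, centroids):
    """Map each feature frame to the index of its nearest k-means centroid."""
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(frame, centroids[k]))
        for frame in frames
    ]


# Assumed setup: 16 kHz audio, 320-sample stride (20 ms per frame).
rate = tokens_per_second(16_000, 320)

# Toy stand-ins for a trained codebook and for extracted layer features.
centroids = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
frames = [[0.1, -0.2], [0.9, 1.2], [4.0, 6.0], [0.4, 0.4]]
tokens = quantize(frames, centroids)

print(rate)    # 50.0
print(tokens)  # [0, 1, 2, 0]
```

In a real pipeline the centroids would come from k-means fitted on features from the chosen layer, and each frame would be a high-dimensional vector rather than a 2-D point.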

Hi @yangdongchao, I have read your research paper, and it is quite interesting to see one model do it all. Are you planning to release the training code, or just...

While analyzing the new HiFi-Codec code, I encountered three small bugs: 1. Torchaudio `MelSpectrogram`: at https://github.com/yangdongchao/AcademiCodec/blob/3ee7baf94387e72de5777a7b824e401d1663cc11/HiFi-Codec/train.py#L31, `MelSpectrogram` is not imported before use: ``` from torchaudio.transforms import MelSpectrogram...

This paper, https://arxiv.org/pdf/2401.01099.pdf, suggests a better masking strategy with grouped acoustic tokens, like those of HiFi-Codec, which results in far better quality than SoundStorm.

Hi @ex3ndr, I checked out your code here: https://github.com/ex3ndr/supervoice-gpt/blob/master/train_tokenizer.py. I saw that you have tried two training runs, one with text and the other with phonemes. Is there any specific reason you...

Hi @ex3ndr, hope you are doing well. I am looking to add more languages and accents to your model. Are you planning to add a fine-tuning script?