Osman KARABULUT

14 comments by Osman KARABULUT

I couldn't find model.train either. As I understand it, the switch between train and inference modes is handled differently. Could you try other variants of CosyVoice-300m for your problem?

If I remember correctly, these are in executor.py.

I got this error:

RuntimeError: Error(s) in loading state_dict for TransformerLM:
Missing key(s) in state_dict: "text_embedding.weight", "text_encoder_affine_layer.weight", "text_encoder_affine_layer.bias", "llm_embedding.weight", "llm_decoder.weight", "llm_decoder.bias", "speech_embedding.weight", "spk_embed_affine_layer.weight", "spk_embed_affine_layer.bias".
Unexpected key(s) in state_dict: "module", "buffer_names",...
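
The unexpected keys "module" and "buffer_names" suggest the file is a wrapped checkpoint in which the model weights sit under a "module" entry rather than at the top level. The thread does not confirm this, so the following is only a minimal sketch under that assumption. The helper name `unwrap_state_dict` is hypothetical and is not part of CosyVoice.

```python
from typing import Any, Dict


def unwrap_state_dict(checkpoint: Dict[str, Any], key: str = "module") -> Dict[str, Any]:
    """Return the nested weights dict if the checkpoint is wrapped, else the checkpoint itself.

    Assumption: a wrapped checkpoint stores the real state_dict under `key`.
    """
    nested = checkpoint.get(key)
    if isinstance(nested, dict):
        return nested
    return checkpoint


# Illustrative data only: plain lists stand in for tensors.
wrapped = {"module": {"text_embedding.weight": [0.0]}, "buffer_names": []}
weights = unwrap_state_dict(wrapped)
print(sorted(weights))  # ['text_embedding.weight']
```

With PyTorch, the checkpoint would typically be read with `torch.load(path, map_location="cpu")` and the unwrapped result passed to `model.load_state_dict(...)`.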

I added the following lines to cosyvoice.yaml, but the training doesn't resume from the specified checkpoint. Instead, it starts from epoch 0, step 0.

current_epoch: 1 # added
current_step: 30000...
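
One possible explanation, not confirmed in the thread, is that the training script takes its starting epoch and step from metadata saved with the checkpoint rather than from keys in cosyvoice.yaml. In that case, keys the script never reads would be silently ignored. A minimal sketch of that pattern follows; the function `resume_position` and the `epoch` and `step` key names are hypothetical.

```python
from typing import Any, Dict, Tuple


def resume_position(checkpoint_info: Dict[str, Any]) -> Tuple[int, int]:
    """Return (start_epoch, start_step) from checkpoint metadata.

    Falls back to (0, 0) when the metadata carries no position,
    which would look like training restarting from scratch.
    """
    start_epoch = int(checkpoint_info.get("epoch", 0))
    start_step = int(checkpoint_info.get("step", 0))
    return start_epoch, start_step


print(resume_position({"epoch": 1, "step": 30000}))  # (1, 30000)
print(resume_position({}))                           # (0, 0)
```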

TEST 1

from cosyvoice.cli.cosyvoice import CosyVoice
from cosyvoice.utils.file_utils import load_wav
import torchaudio
cosyvoice = CosyVoice('./pretrained_models/CosyVoice-300M-TR')
import torch
from cosyvoice.utils.train_utils import (
    init_distributed, init_dataset_and_dataloader, init_optimizer_and_scheduler,
    init_summarywriter, save_model, wrap_cuda_model, check_modify_and_save_config)
from hyperpyyaml...

@aluminumbox

Hi @SongYao2, how can I fix that? I can't resume training because of this.

> check whisper tokenizer, add language token at sentence start.

For example, I want to use the voice of a Turkish speaker while reading text in a different language. Do...
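
The quoted advice about adding a language token at the start of the sentence can be sketched as a small helper. The `<|en|>` tag format follows Whisper-style language tokens. Whether a given CosyVoice model recognises a particular tag is an assumption to check against its tokenizer, and `add_language_token` is a hypothetical name.

```python
def add_language_token(text: str, lang: str) -> str:
    """Prefix text with a Whisper-style language tag such as <|en|>.

    The tag is not added again if the text already starts with it.
    """
    tag = f"<|{lang}|>"
    stripped = text.lstrip()
    if stripped.startswith(tag):
        return stripped
    return tag + stripped


# The tag names the language of the text to be read, not the speaker's language.
print(add_language_token("Hello world", "en"))  # <|en|>Hello world
```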

It produces mostly meaningless sounds. The voice transcription is successful, but the words are not pronounced correctly. I couldn't tell what language was being spoken ``` # cross_lingual usage...