KAN-TTS
KAN-TTS is a speech-synthesis training framework. Please try the demos we have posted at https://modelscope.cn/models?page=1&tasks=text-to-speech
Original Traceback (most recent call last): File "/home/akira/anaconda3/envs/maas/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop data = fetcher.fetch(index) File "/home/akira/anaconda3/envs/maas/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch data = [self.dataset[idx] for idx in possibly_batched_index] File "/home/akira/anaconda3/envs/maas/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py",...
During the feature-extraction step, the program runs normally with no errors or warnings, but the final output is missing the `se` folder. I am using tts-autolabel 1.1.7 and modelscope 1.8.1.
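Hello, could you distribute an aarch64 build of ttsfrd? In the TTS model deployment chain, this seems to be the only component without an aarch64 version; with it, the pipeline could run on all kinds of edge-computing devices.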
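Hi, I have a question about the HiFi-GAN architecture design. In addition to the transpose_upsamples from the original HiFi-GAN structure, nn.Upsample is also stacked on top. What is the motivation for this? Is the benefit that the results are more stable? It seems that stacking these two structures increases the amount of computation.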
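Whatever the motivation, two upsampling paths can only be combined if they produce the same output length. The sketch below uses the output-length formulas documented for PyTorch's `nn.ConvTranspose1d` and `nn.Upsample`, written in plain Python so it runs without PyTorch. The helper names and the example values (stride 8, kernel 16, padding `(kernel - stride) // 2`) are assumptions chosen for illustration; how KAN-TTS actually wires the two layers together is not stated in this issue.

```python
import math


def conv_transpose1d_out_len(length: int, kernel_size: int, stride: int,
                             padding: int = 0, output_padding: int = 0,
                             dilation: int = 1) -> int:
    """Output length of nn.ConvTranspose1d, per the PyTorch documentation."""
    return ((length - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


def upsample_out_len(length: int, scale_factor: float) -> int:
    """Output length of nn.Upsample along the time axis: floor(L * scale)."""
    return math.floor(length * scale_factor)


if __name__ == "__main__":
    frames = 100                      # illustrative number of input frames
    stride, kernel = 8, 16            # assumed upsampling rate and kernel size
    padding = (kernel - stride) // 2  # assumed padding convention

    print(conv_transpose1d_out_len(frames, kernel, stride, padding))
    print(upsample_out_len(frames, stride))
```

With these values both paths map 100 frames to 800 samples. `nn.Upsample` itself has no learnable parameters, so any additional cost depends mainly on what layers follow it.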
OS: centos7.9 Python/C++ Version:python3.9 gcc4.8.5 Package Version:pytorch==1.13.1、modelscope==1.5.2、kantts==1.0.0、torchaudio==0.13.1 Model: speech_personal_sambert-hifigan_nsf_tts_zh-cn_pretrain_16k Command: ``` from modelscope.metainfo import Trainers from modelscope.trainers import build_trainer from modelscope.utils.audio.audio_utils import TtsTrainType pretrained_model_id = 'damo/speech_personal_sambert-hifigan_nsf_tts_zh-cn_pretrain_16k' dataset_id = "./output_training_data/" pretrain_work_dir...
Hi, thanks for the great work! However, we could not find an open-source license in the project. Is there any information about the license?
(/media/lab-hp/B23AB5DD3AB59F33/condaenv/maas) lab-hp@labhp-HP:~/桌面/KAN-TTS$ python ./kantts/bin/text_to_wav.py --txt test.txt --output_dir res --res_zip speech_sambert-hifigan_tts_zh-cn_multisp_pretrain_16k/resource.zip --am_ckpt speech_sambert-hifigan_tts_zh-cn_multisp_pretrain_16k/basemodel_16k/sambert/ckpt/checkpoint_980000.pth --voc_ckpt speech_sambert-hifigan_tts_zh-cn_multisp_pretrain_16k/basemodel_16k/hifigan/ckpt/checkpoint_2000000.pth --speaker xiaoyu 2023-07-24:22:10:22, INFO [text_to_wav.py:97] Converting text to symbols... Load pinyin_en_mix_dict failed Load pinyin_en_mix_dict failed Load...
# Feature extraction python kantts/preprocess/data_process.py --voice_input_dir ptts_spk0_autolabel --voice_output_dir training_stage/test_male_ptts_feats --audio_config kantts/configs/audio_config_se_16k.yaml --speaker F7 --se_model speech_personal_sambert-hifigan_nsf_tts_zh-cn_pretrain_16k/basemodel_16k/speaker_embedding/se.* # Extend epochs stage0=training_stage voice=test_male_ptts_feats cat $stage0/$voice/am_valid.lst >> $stage0/$voice/am_train.lst lines=0 while [ $lines -lt 400 ] do...
inputs_text_embedding + pitch_embeddings + energy_embeddings RuntimeError: The size of tensor a (50) must match the size of tensor b (411) at non-singleton dimension 1 2023-07-03:11:07:19 INFO [trainer.py:903] torch.Size([32, 50, 4])...
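The error above is a broadcasting failure: element-wise addition requires every dimension, aligned from the right, to be either equal or 1. The sketch below reproduces that rule in plain Python so the mismatch can be checked without PyTorch. The helper `broadcast_mismatch_dims` is hypothetical, and the trailing feature size of 4 is an assumption for illustration; only the sizes 32, 50, and 411 come from the log.

```python
from typing import List, Sequence


def broadcast_mismatch_dims(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Return the dimensions (indexed in the broadcast result) at which
    shapes `a` and `b` cannot be broadcast together."""
    rank = max(len(a), len(b))
    # Left-pad the shorter shape with 1s so both shapes have the same rank.
    padded_a = (1,) * (rank - len(a)) + tuple(a)
    padded_b = (1,) * (rank - len(b)) + tuple(b)
    return [
        dim
        for dim, (x, y) in enumerate(zip(padded_a, padded_b))
        if x != y and x != 1 and y != 1
    ]


if __name__ == "__main__":
    text_shape = (32, 50, 4)    # batch, text length, assumed feature size
    pitch_shape = (32, 411, 4)  # batch, pitch length, assumed feature size
    print(broadcast_mismatch_dims(text_shape, pitch_shape))
```

A non-empty result at dimension 1 means the two tensors have different sequence lengths: here one is 50 steps long and the other 411, so the pitch and energy features do not share the time resolution of the text embedding.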