fish-speech issues

[BUG] Error during VITS fine-tuning

An error is raised after 5 epochs of VITS fine-tuning, and restarting the fine-tuning fails immediately. It looks like a problem during validation: a float tensor is passed to the embedding layer. The command is

```bash
python fish_speech/train.py --config-name vits_decoder_finetune
```

I changed the dataset path and batch_size in the yml. PyTorch version 2.3.1, CUDA 12.1. The log follows:

---

> Error executing job with overrides: [] [rank0]: Traceback (most recent call last): [rank0]: File "/home/snowfox/fish_audio/fish_speech/train.py",...

SnowFox4004

bug
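
The symptom described in this report (a float tensor reaching an embedding layer) can be reproduced in isolation. The snippet below is only a minimal sketch of that general PyTorch behaviour, not the fish-speech code path: the embedding sizes and the `lookup` helper are hypothetical, and the traceback in the report is truncated, so the actual failing call is not known.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only for illustration.
embedding = nn.Embedding(num_embeddings=10, embedding_dim=4)

# Indices that arrive as float32, as in the reported symptom.
float_ids = torch.tensor([1.0, 2.0, 3.0])


def lookup(ids: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: cast float indices to int64 before the lookup."""
    if ids.is_floating_point():
        ids = ids.long()
    return embedding(ids)


# nn.Embedding only accepts integer index tensors, so this call is rejected.
try:
    embedding(float_ids)
    float_lookup_failed = False
except (RuntimeError, TypeError) as exc:
    float_lookup_failed = True
    print(f"float indices rejected: {type(exc).__name__}")

# After the cast, the same indices produce one vector per index.
vectors = lookup(float_ids)
print(vectors.shape)
```

If validation batches really do carry float indices, a cast of this kind at the point where the batch is built is one possible workaround; whether it applies here depends on the part of the log that is not shown.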

[BUG] Inference error on a 4090 GPU

On a 4090 GPU, inference through the WebUI raises an error and cannot run, but inference through the API works normally.

aofengdaxia

bug

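[help] How many steps are appropriate for VITS fine-tuning?

After 10000 steps of VITS fine-tuning, the synthesized voice turns into a buzzing sound. This does not happen at 1000 steps. This is the training configuration: ![1719540278485](https://github.com/fishaudio/fish-speech/assets/13692249/eed17b01-bbe9-4364-883b-679aaccbc4b4) Audio duration: 28 minutes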

leo3349

bug

[BUG] Slow PyTorch download when installing the environment

5

Feel free to ask any kind of questions in the issues page, but please use English since other users may find your questions valuable. **Describe the bug** https://github.com/fishaudio/fish-speech/blob/fbe2e3f030d9a2fe5455e56a7a9abb72161f6d0f/install_env.bat#L251-L256 The Shanghai Jiao Tong University reverse-proxy mirror for PyTorch no longer seems to speed up downloads: the rate has dropped from the previous 10 MB/s to 20-400 KB/s.

Touch-Night

bug

Bug report: generating streaming audio through the API

11

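When calling tool.api, I included the streaming parameter in the POST request and set the format to wav, but the server still did not return the audio stream in chunks; it returned everything only after generation had finished. Looking at tool/api.py, I found that streaming is missing from the parameters that InvokeRequest accepts. ![image](https://github.com/fishaudio/fish-speech/assets/11146882/5c1dad86-ae7f-4450-8484-c270c978fd55) After adding streaming=True, the API call works and correctly returns streaming audio data.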

hotdogarea

bug

stale
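
The failure mode in the streaming report (a request model that does not declare a field, so the value sent by the client never reaches the handler) can be illustrated with a small standard-library sketch. Everything here is hypothetical: the two dataclasses, the `parse` helper, and the default of `False` are stand-ins and are not the actual `InvokeRequest` definition in tool/api.py.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict


@dataclass
class RequestWithoutStreaming:
    """Hypothetical request model that does not declare `streaming`."""
    text: str
    format: str = "wav"


@dataclass
class RequestWithStreaming:
    """Hypothetical request model that declares `streaming`."""
    text: str
    format: str = "wav"
    streaming: bool = False


def parse(model, payload: Dict[str, Any]):
    """Build `model` from `payload`, silently ignoring undeclared keys."""
    known = {f.name for f in fields(model)}
    return model(**{k: v for k, v in payload.items() if k in known})


# The client sends the flag in both cases.
payload = {"text": "hello", "format": "wav", "streaming": True}

old = parse(RequestWithoutStreaming, payload)
new = parse(RequestWithStreaming, payload)

# Without the declared field the flag is dropped, so a handler would fall
# back to non-streaming behaviour.
print(getattr(old, "streaming", False))
print(new.streaming)
```

Declaring the field on the model is what lets the handler see the client's choice. The default value is a separate design decision: the reporter set `streaming=True`, while the sketch uses `False` only to make the dropped flag visible.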

Second round of LoRA training

4

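Following the pretrain config, I trained a LoRA with use_speaker: false and a relatively high learning rate. After merging the LoRA, the model can roughly speak another language, but the pronunciation is not accurate enough. Using that merged model as the base and continuing to fine-tune with LoRA under finer-grained settings fails: it hangs after data preparation completes, at the "Found 650 groups" message, with no error reported.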

Naozumi520

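Does the current version not recognize Traditional Chinese characters? With Traditional Chinese text, the output is misread or skipped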

1

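When Traditional Chinese characters are mixed into -text during token generation, they are skipped or misread when the speech is generated afterwards.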

Json0926

bug

[help]Help Needed: Unable to Start Training on Windows

3

**Describe the bug** I cloned the repository https://github.com/fishaudio/fish-speech.git, ran install_env.bat, and successfully launched the WebUI using start.bat. After selecting any model (VQGAN, VITS, LLAMA) and clicking "Start Training," I encounter...

Jyu433

bug

good first issue

prompt_tokens obtained from VQGAN and VITS encoding differ [BUG]

2

Feel free to ask any kind of questions in the issues page, but please use English since other users may find your questions valuable. **Describe the bug** A clear and...

1nlplearner

bug

Hallucination in the autoregressive model

1

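While testing this model, I found some pronunciation errors. For example, "APP" is read as "aat", which I suspect is caused by hallucination in the autoregressive llama model. Would a larger llama model, or more training data, solve this? For domain-specific terminology, would fine-tuning llama on data that contains those terms reduce these pronunciation errors?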

Wangrui025

enhancement
