fish-speech

SOTA Open Source TTS

Results 304 fish-speech issues

Inference is still slow on an RTX 3080 Ti, even with --compile: it takes at least 120 seconds, even for a small text input.

bug

Hi, are there any experiments on training the LLM with speech input? There are two kinds of input: the indices of the codebook in the codec, as a single integer value, or the indexed...

pip install audio-seperator fails with an error. That is how it is written in the repository. pip install audio-separator works.

bug

The conflict is caused by: transformers 4.35.2 depends on tokenizers=0.14 faster-whisper 0.8.0 depends on tokenizers==0.13.* transformers 4.35.2 depends on tokenizers=0.14 faster-whisper 0.7.1 depends on tokenizers==0.13.* transformers 4.35.2 depends on tokenizers=0.14...

bug

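Hello, I would like to train a French TTS model. Do I need to modify the code, and if so, what changes are needed to support French? Also, roughly how many hours of dry (clean) voice recordings are needed to train a reasonably good TTS? This TTS is for a specialized domain (technology) and does not need strong generalization.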

enhancement

![image](https://github.com/user-attachments/assets/06fdd6ba-0295-4bff-852b-65aa1ed12995)

bug

This pull request addresses an issue in tools/vqgan/inference.py where the import statement for AUDIO_EXTENSIONS was incorrect. The import statement was originally:
```python
from fish_speech.utils.file import AUDIO_EXTENSIONS
```
It has been...

**Is this PR adding new feature or fix a BUG?** Add feature / Fix BUG. **Is this pull request related to any issue? If yes, please link the issue.** #xxx

Training t2s is very slow, about 0.09 it/s. I am using 8 RTX A6000 GPUs with a batch size of 16. Is this training speed normal? I profiled the run with the Lightning profiler, and backward and step take the most time. Here are the backward and step results from the advanced profiler:
```
Profile stats for: [Strategy]DDPStrategy.backward rank: 0
190 function calls (185 primitive calls) in 43.795 seconds
Ordered by: cumulative time
ncalls tottime percall...
```

enhancement