spicysama comments

Results: 123 comments of spicysama

[Feature] Could you provide API usage documentation and examples, for example how to connect it to the TTS feature of the 开源阅读 app?

Here is an example. Both streaming and non-streaming synthesis are supported: ![image](https://github.com/fishaudio/fish-speech/assets/122108331/4ed3d3a4-9563-44ba-9d49-6a25879441b1) The code has been submitted as a PR.

[Feature] Could you provide API usage documentation and examples, for example how to connect it to the TTS feature of the 开源阅读 app?

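@unlimit999 Sorry, I need to give priority to the official integration, so I'm removing it for now; you can add yours after mine. Don't use FastAPI, use the officially bundled kui instead. I'll post an example shortly.

1. Open `API_FLAGS.txt` and change it as shown below: ![image](https://github.com/fishaudio/fish-speech/assets/122108331/a50fbd51-ee2c-4b14-aaed-015eae3a7472)
2. Click `start.bat` to run the API service. ![image](https://github.com/fishaudio/fish-speech/assets/122108331/a7f99e90-c85a-4e5f-8504-8eda371e1e60)
3. Open your computer's network settings. ![image](https://github.com/fishaudio/fish-speech/assets/122108331/56176c1b-6342-468d-bb67-edef8e186744)
4. Look up the IP address. ![image](https://github.com/fishaudio/fish-speech/assets/122108331/66e80385-6b90-4817-8da3-831ccaac4cc3)
5. Open the 开源阅读 app and fill in the fields as shown below, using the IP address from the previous step. Read-aloud then works. I'll attach the exact parameters shortly: ![04f671f55491e3d0949cd0130bc59e11](https://github.com/fishaudio/fish-speech/assets/122108331/ba77fb1e-5673-46f0-96cc-3e8f6a1ea251)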
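The IP lookup can also be done from Python instead of the network settings dialog. The following is only a sketch: `lan_ip` and `api_base_url` are hypothetical helper names, and the port `8080` is a placeholder that has to match whatever address is configured in `API_FLAGS.txt`. The probe address is used only to let the OS pick an outbound interface; connecting a UDP socket sends no packets.

```python
import socket


def lan_ip(probe_host: str = "192.0.2.1", probe_port: int = 80) -> str:
    """Return the IPv4 address of the interface used for outbound traffic.

    Connecting a UDP socket only asks the OS to choose a route; afterwards
    getsockname() reports the local address bound to that route.
    Falls back to 127.0.0.1 when no route is available.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.connect((probe_host, probe_port))
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"
    finally:
        sock.close()


def api_base_url(ip: str, port: int) -> str:
    """Build the base URL to enter in the reader app (port is a placeholder)."""
    return f"http://{ip}:{port}"


if __name__ == "__main__":
    # 8080 is an assumed port, not taken from the thread.
    print(api_base_url(lan_ip(), 8080))
```

If the printed address is `127.0.0.1`, the machine has no usable route and the phone will not be able to reach it; both devices need to be on the same network.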

[Feature] Could you provide API usage documentation and examples, for example how to connect it to the TTS feature of the 开源阅读 app?

I tested it and it works, but `api.py` needs a few modifications. See the latest PR for details.

[Feature] Could you provide API usage documentation and examples, for example how to connect it to the TTS feature of the 开源阅读 app?

![image](https://github.com/fishaudio/fish-speech/assets/122108331/0776892b-6740-4b3e-aaee-4418264e6120) Change `Content-Type` to `audio/wav`.
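As background on why the content type matters, `audio/wav` tells the client that the bytes form a RIFF/WAVE container. The sketch below uses only the Python standard library and is not taken from the project's `api.py`; `pcm_to_wav_bytes` and `wav_response` are hypothetical names, and the 44100 Hz mono 16-bit format is an assumption for illustration.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 44100, channels: int = 1) -> bytes:
    """Wrap raw 16-bit PCM samples in a WAV (RIFF) container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


def wav_response(pcm: bytes, sample_rate: int = 44100) -> tuple[dict[str, str], bytes]:
    """Return (headers, body) for a WAV payload labeled as audio/wav."""
    body = pcm_to_wav_bytes(pcm, sample_rate)
    headers = {"Content-Type": "audio/wav", "Content-Length": str(len(body))}
    return headers, body


if __name__ == "__main__":
    headers, body = wav_response(b"\x00\x00" * 100)
    print(headers, body[:4])
```

A client that is told a different content type may try to decode the same bytes as another format and fail, which is presumably why the setting has to match.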

[Feature] Could you provide API usage documentation and examples, for example how to connect it to the TTS feature of the 开源阅读 app?

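To use reference audio and text, follow these steps:

0. Update the code.
1. Create a `ref_data` folder in the project root, then create subfolders in it named after each character. ![image](https://github.com/fishaudio/fish-speech/assets/122108331/f6e980e9-d20c-4d86-90a6-eaac5c6859f9)
2. Sort the audio clips by emotion and put them into subfolders named after each emotion. Every clip needs both a `.lab` annotation file and a `.wav` audio file. ![image](https://github.com/fishaudio/fish-speech/assets/122108331/50c39df5-e58f-4102-9172-eeb9ee68c39d)
3. Click `run_cmd.bat` and enter:
   ```bat
   python tools\gen_ref.py
   ```
   This generates a `ref_data.json` file that stores the paths of the audio and annotation files to be used.
4. Set `API_FLAGS.txt` as follows: ![image](https://github.com/fishaudio/fish-speech/assets/122108331/960ed618-b56d-4193-b061-55fb30a86beb)
5. Click `start.bat` to run the API server.
6. Go back to 开源阅读 and configure it as shown below. The configuration depends on the folders created in step 1. Voice: 胡桃, emotion: happy. Effect: one audio clip from the `happy` folder is picked at random as the reference. ![31dc947f308dff527bb0d7e98e793c1](https://github.com/fishaudio/fish-speech/assets/122108331/64da75bf-b8ed-4eb5-9b68-ce7dad4a0f3b)
7. Open a book and start reading aloud. ![image](https://github.com/fishaudio/fish-speech/assets/122108331/b57f7f0b-c666-42ee-a3db-b46b34fb6c0d)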
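The folder layout and the random pick in the steps above can be sketched as follows. This is not the actual `tools/gen_ref.py`, and the JSON shape is an assumption made for illustration; `build_ref_index` and `pick_reference` are hypothetical names. The sketch indexes `ref_data/<character>/<emotion>/*.wav`, keeps only clips that have a matching `.lab` file, and picks one entry at random for a given character and emotion.

```python
import json
import random
from pathlib import Path


def build_ref_index(root: Path) -> dict:
    """Map character -> emotion -> list of {"audio", "label"} path pairs.

    Only .wav files with a sibling .lab file of the same name are included.
    """
    index: dict = {}
    for wav in sorted(root.glob("*/*/*.wav")):
        lab = wav.with_suffix(".lab")
        if not lab.is_file():
            continue  # skip clips that have no annotation
        character, emotion = wav.parts[-3], wav.parts[-2]
        index.setdefault(character, {}).setdefault(emotion, []).append(
            {"audio": wav.as_posix(), "label": lab.as_posix()}
        )
    return index


def pick_reference(index: dict, character: str, emotion: str) -> dict:
    """Pick one reference pair at random for the given character and emotion."""
    return random.choice(index[character][emotion])


if __name__ == "__main__":
    index = build_ref_index(Path("ref_data"))
    Path("ref_data.json").write_text(
        json.dumps(index, ensure_ascii=False, indent=2), encoding="utf-8"
    )
```

With a layout such as `ref_data/胡桃/happy/clip1.wav` plus `clip1.lab`, calling `pick_reference(index, "胡桃", "happy")` would return one of the indexed pairs from that folder.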

Your Hugging Face demo is super fast compared to my local inference

> Probably not, because I tried to run with the compile flag: `python tools/run_webui.py --llama-checkpoint-path checkpoints/fish-speech-1.5 --decoder-checkpoint-path checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth --compile`
>
> ## I get the error:
> 2024-12-14...
