SapphireLab comments

Results: 75 comments by SapphireLab

ASR error: Max retries exceeded with url

If the automatic download through the API fails, you can also open the web pages and download the files manually. FunASR needs these three models.

```
https://modelscope.cn/models/iic/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/files
https://modelscope.cn/models/iic/speech_fsmn_vad_zh-cn-16k-common-pytorch/files
https://modelscope.cn/models/iic/punc_ct-transformer_zh-cn-common-vocab272727-pytorch/files
```

Then put the three model folders into `tools/asr/models`.

![image](https://github.com/RVC-Boss/GPT-SoVITS/assets/36986837/2eca0e60-bc3d-4a1e-8da8-848c5eb67200)
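After a manual download, a quick check that all three folders landed in the right place can save a failed run. The sketch below is only an illustration: it assumes each folder is named after the last component of its ModelScope model ID, and `missing_models` is a hypothetical helper, not something shipped with the repository.

```python
from pathlib import Path

# Assumption: each model folder is named after the last path component
# of its ModelScope model ID (taken from the three URLs in the comment).
REQUIRED_MODELS = (
    "speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch",
    "speech_fsmn_vad_zh-cn-16k-common-pytorch",
    "punc_ct-transformer_zh-cn-common-vocab272727-pytorch",
)


def missing_models(models_dir="tools/asr/models"):
    """Return the names of required model folders not found in models_dir."""
    root = Path(models_dir)
    return [name for name in REQUIRED_MODELS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_models()
    if missing:
        print("Missing model folders:")
        for name in missing:
            print("  " + name)
    else:
        print("All three FunASR model folders are present.")
```

Run it from the repository root so the relative path `tools/asr/models` resolves correctly.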

A strange bug in the LangSegment library

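Reproduced it successfully: as long as there is any punctuation (in any quantity) between two plus signs, the text is detected as Arabic (ar) and is then dropped by the default filter. ![image](https://github.com/RVC-Boss/GPT-SoVITS/assets/36986837/2293e522-1053-40fa-99fe-8f180abd06a2)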

Add restrictions on file extensions

Why not define `AUDIO_EXTENSION` in a single config file such as `tools/my_utils.py` and use `from xxx import AUDIO_EXTENSION` for reusability? It will be easier when adding new extensions or doing...
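A minimal sketch of what that suggestion could look like. The extension list and the `is_audio_file` helper are hypothetical and only illustrate the idea of one shared definition; the actual set of supported formats is up to the project.

```python
from pathlib import Path

# Hypothetical contents: the real list of supported formats may differ.
# Defined once (for example in tools/my_utils.py) and imported elsewhere.
AUDIO_EXTENSION = frozenset({".wav", ".mp3", ".flac", ".ogg", ".m4a"})


def is_audio_file(path):
    """Return True if the path ends with one of the shared audio extensions."""
    return Path(path).suffix.lower() in AUDIO_EXTENSION
```

Other modules would then import the constant instead of repeating the list, so adding a new extension touches a single line.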

ASR inference error

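Can the M40 be used with CUDA 12? I suspect there will be compatibility issues. Did a half-precision warning appear before the error? If so, change `is_half` in config.py; this card probably does not support half-precision computation.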

Traceback (most recent call last): File "/Users/wangxiaowei19/Study/GPT/GPT-SoVITS/webui.py", line 872, in <module> app.queue(concurrency_count=511, max_size=1022).launch( File "/usr/local/lib/python3.9/site-packages/gradio/blocks.py", line 2012, in queue raise DeprecationWarning( DeprecationWarning: concurrency_count has been deprecated. Set the concurrency_limit directly on event listeners e.g. btn.click(fn, ..., concurrency_limit=10) or gr.Interface(concurrency_limit=10). If necessary, the total number of workers can be configured via `max_threads` in launch().

Your title is rather too long. Also, the cause is as the comment above says; see #527.

Optimised graphics card recognition and half-precision recognition

The only problem is the one stated in issue #808: using MPS for training on macOS is not recommended.

Adjust DPO

It does seem reasonable now, but what about #950?

ASR not working

Did you install the x86_64 build of Miniconda? Check the output of `conda info`: if the platform is x86_64, it may cause problems in CTranslate2 that stop faster-whisper from working.
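The same check can be made from inside the Python environment itself. This is a rough sketch that assumes the concern is an x86_64 interpreter running on a machine with a different native architecture; `is_x86_64` is a hypothetical helper name.

```python
import platform


def is_x86_64(machine=None):
    """Return True if the given (or current) machine string names x86_64."""
    name = (machine or platform.machine()).lower()
    return name in {"x86_64", "amd64"}


if __name__ == "__main__":
    # platform.machine() reports the architecture the interpreter runs as.
    print("Interpreter architecture:", platform.machine())
    if is_x86_64():
        print("This Python is an x86_64 build.")
```

Run it with the interpreter from the conda environment that launches the ASR tool, since that is the one whose architecture matters.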

ASR not working

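> There was a problem downloading the DAMO ASR models. Has anyone else run into the same problem? How can it be solved?

Try a different network. If you can reach modelscope, you can also just visit the site and download the files directly. The API may be misbehaving.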

When the text contains many repetitions, drawn-out sounds, silences, and dropped sentences become more likely

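> Also, regarding top_p and top_k sampling: the code applies top_p first and then top_k. My understanding is that it might be better to first use a relatively loose k to roughly bound the range of candidate tokens, and then use a stricter p to filter further within that range?

Same question, +1. I found that the two functions `top_k_top_p_filtering()` and `logits_to_probs()` apply top_k and top_p in different orders. Normally it should be top_k first, then top_p.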
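To make the ordering concrete, here is a small pure-Python sketch of "top_k first, then top_p" on a plain list of logits. It is not the repository's implementation: `top_k_then_top_p` and `softmax` are hypothetical names, and real code would operate on tensors.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_then_top_p(logits, top_k, top_p):
    """Return a copy of logits with filtered-out entries set to -inf.

    Step 1 keeps the top_k largest logits (a loose bound on candidates).
    Step 2 renormalises over those survivors and keeps the smallest
    prefix whose cumulative probability reaches top_p.
    """
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    kept = order[:max(1, top_k)]  # always keep at least one token

    probs = softmax([logits[i] for i in kept])
    allowed = set()
    cumulative = 0.0
    for index, prob in zip(kept, probs):
        allowed.add(index)
        cumulative += prob
        if cumulative >= top_p:
            break

    neg_inf = float("-inf")
    return [x if i in allowed else neg_inf for i, x in enumerate(logits)]
```

With this order, k only trims the long tail, and p makes the final, stricter cut on a distribution that has already been renormalised over the k survivors. Swapping the two steps can yield a different candidate set for the same k and p.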