MeloTTS

Is it possible to export to onnx?

Open AngelGuevara7 opened this issue 1 year ago • 3 comments

Hello, I just discovered the ONNX format and its speed advantages. Has anyone tried exporting MeloTTS to ONNX?

AngelGuevara7 avatar Apr 15 '24 10:04 AngelGuevara7

That is possible; you can refer to these: https://github.com/fishaudio/Bert-VITS2/blob/master/export_onnx.py https://github.com/fishaudio/Bert-VITS2/blob/master/onnx_infer.py

jeremy110 avatar Apr 19 '24 09:04 jeremy110

What branch or commit of Bert-VITS2 does it correspond to? I checked, and the models differ quite a bit, e.g. the TextEncoder.

walletiger avatar May 12 '24 02:05 walletiger

@walletiger The basic architectures are mostly the same, but MeloTTS supports more languages and uses IPA. The latest Bert-VITS2 has added WavLM and emotion conditioning to the basic architecture. Earlier versions (before 2.0.0) seem to be quite similar.

jeremy110 avatar May 12 '24 06:05 jeremy110

Please have a look at https://github.com/myshell-ai/MeloTTS/issues/164

You can also run the exported ONNX models on android, ios, raspberry pi, etc, using C++.

csukuangfj avatar Jul 17 '24 06:07 csukuangfj

I have already exported to ONNX, but I want to synthesize different voices based on sid. The config file has fields like "n_speakers": 256, "spk2id": { "ZH": 1 }. I guess either there are 256 different speakers, or the n_speakers dimension is 256, so I changed the value of n_speakers, but the voice did not seem to change.

pengpengtao avatar Jul 18 '24 05:07 pengpengtao

Looking at this class (screenshot omitted), I don't see any handling of speaker_id, yet the original code can change the voice through it. Generally male and female timbres differ considerably; how does it distinguish between them?

pengpengtao avatar Jul 18 '24 05:07 pengpengtao

The Chinese-English model has only one speaker, and its speaker_id is fixed at 1.

I have not tried the other models.

csukuangfj avatar Jul 18 '24 06:07 csukuangfj

The Chinese-English model has only one speaker, and its speaker_id is fixed at 1.

I have not tried the other models.

OK, I'll try the other models.

pengpengtao avatar Jul 18 '24 06:07 pengpengtao

If the speaker id is wrong, the generated audio is silent. I debugged for a long time before discovering this.

I hope you can avoid this problem.

csukuangfj avatar Jul 18 '24 06:07 csukuangfj

If the speaker id is wrong, the generated audio is silent. I debugged for a long time before discovering this.

I hope you can avoid this problem.

I just tried it: I added sid, but no matter what sid I use there is no sound, so I'm confused. Doesn't it output a voice according to the sid? At the very least there should be two ids to match male and female voices. Or is the voice extractor so powerful that it can completely disentangle male and female timbres?

pengpengtao avatar Jul 18 '24 06:07 pengpengtao

No matter what sid I use, there is no sound

How many sids did you try before reaching that conclusion?

csukuangfj avatar Jul 18 '24 07:07 csukuangfj

Some sid will always produce sound. Did you exhaustively try every value from 1 to 1<<31? If not, have you tried 0 to 10?
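The sweep suggested above can be sketched like this; `synthesize` is a hypothetical stand-in for an ONNX inference call (in the toy version below only sid 1 is non-silent, mimicking the Chinese-English model's fixed speaker id):

```python
import numpy as np

def synthesize(sid: int) -> np.ndarray:
    """Hypothetical stand-in for ONNX inference: returns a waveform.

    Only sid == 1 yields non-silent audio here, as a toy model of the
    Chinese-English checkpoint whose speaker id is fixed at 1.
    """
    t = np.linspace(0, 1, 16000)
    return np.sin(2 * np.pi * 220 * t) if sid == 1 else np.zeros_like(t)

def find_audible_sids(max_sid: int = 10, threshold: float = 1e-4) -> list:
    """Try sids 0..max_sid and keep those whose output is not silent."""
    return [sid for sid in range(max_sid + 1)
            if np.abs(synthesize(sid)).max() > threshold]

print(find_audible_sids())  # → [1]
```

Checking peak amplitude this way avoids having to listen to every generated file by hand.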

csukuangfj avatar Jul 18 '24 07:07 csukuangfj

Please have a look at #164

You can also run the exported ONNX models on android, ios, raspberry pi, etc, using C++.

Thanks for your solution!! I tested it with my custom models and it worked perfectly! :) I noticed a 30 MB reduction in model size (from about 190 MB to 160 MB), but the inference speed is almost the same. Did you compare the inference speed between the PyTorch model and the ONNX model?
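A stdlib-only sketch of how such a comparison can be measured; the two `run_*` callables are placeholders, to be swapped for an actual PyTorch forward pass and an ONNX Runtime `session.run` call:

```python
import time

def benchmark(fn, warmup: int = 3, runs: int = 20) -> float:
    """Return mean wall-clock seconds per call, after a few warm-up calls."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs

# Placeholders: in practice use e.g. `lambda: torch_model(x)` and
# `lambda: ort_session.run(None, inputs)`.
run_pytorch = lambda: sum(i * i for i in range(10_000))
run_onnx = lambda: sum(i * i for i in range(10_000))

print(f"pytorch: {benchmark(run_pytorch):.6f}s/call  "
      f"onnx: {benchmark(run_onnx):.6f}s/call")
```

The warm-up calls matter for a fair comparison, since both runtimes do lazy initialization on the first few invocations.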

AngelGuevara7 avatar Jul 18 '24 07:07 AngelGuevara7

No matter what sid I use, there is no sound

How many sids did you try before reaching that conclusion?

I tried sids 255 and 200. If I have to try them all, that's doable too: generate with each one and listen to them one by one.

pengpengtao avatar Jul 18 '24 08:07 pengpengtao

@pengpengtao

https://github.com/myshell-ai/MeloTTS/blob/144a0980fac43411153209cf08a1998e3c161e10/melo/app.py#L33

You should refer to this; you need to use one of the numbers in

models[language].hps.data.spk2id
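Concretely, with the config snippet quoted earlier ("spk2id": { "ZH": 1 }), the valid speaker ids are the values of that mapping, not everything up to n_speakers. A minimal sketch, assuming a MeloTTS-style config.json (the JSON string below just reproduces the quoted fields; in practice you would load the model's real config file from disk):

```python
import json

# Reproduces the config fields quoted above; replace with
# json.load(open("config.json")) for a real model.
config_text = '{"n_speakers": 256, "spk2id": {"ZH": 1}}'
config = json.loads(config_text)

spk2id = config["spk2id"]
print(spk2id)   # {'ZH': 1}

# The only valid speaker id for the ZH model in this config:
sid = spk2id["ZH"]
print(sid)      # 1
```

Any sid outside `spk2id.values()` (such as 200 or 255 here) falls outside the trained speaker embeddings, which is consistent with the silent output described above.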

csukuangfj avatar Jul 18 '24 08:07 csukuangfj

Please have a look at #164 You can also run the exported ONNX models on android, ios, raspberry pi, etc, using C++.

Thanks for your solution!! I tested it with my custom models and it worked perfectly! :) I noticed a 30 MB reduction in model size (from about 190 MB to 160 MB), but the inference speed is almost the same. Did you compare the inference speed between the PyTorch model and the ONNX model?

It's great to hear that it works for you.

Did you compare the inference speed between pytorch model and onnx model?

Unfortunately, we have not done that.

csukuangfj avatar Jul 18 '24 08:07 csukuangfj

Please have a look at #164

You can also run the exported ONNX models on android, ios, raspberry pi, etc, using C++.

I'll close the issue because this comment solves it. Feel free to reopen it.

AngelGuevara7 avatar Jul 26 '24 07:07 AngelGuevara7

Please have a look at #164 You can also run the exported ONNX models on android, ios, raspberry pi, etc, using C++.

Thanks for your solution!! I tested it with my custom models and it worked perfectly! :) I noticed a 30 MB reduction in model size (from about 190 MB to 160 MB), but the inference speed is almost the same. Did you compare the inference speed between the PyTorch model and the ONNX model?

Can you share your script for converting to ONNX? I also tried converting my custom English model but got some awkward pronunciations.

nanaghartey avatar Aug 08 '24 01:08 nanaghartey