add phoonnx models
hello
I am working on my own TTS engine: https://github.com/TigreGotico/phoonnx
When using the espeak phonemizer, the models are compatible with piper TTS; in fact, you are already using my models (#2530).
With my latest training code the .json changed slightly, so I thought it was time to open an issue about phoonnx.
new model:
- https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_miro_espeak
Please note phoonnx is in its early days and I wouldn't exactly consider it production ready, but it works!
The various phonemizers are still undergoing testing and will need to be taken into consideration if sherpa decides to support the non-espeak based models.
    import json
    from typing import Any, Dict

    import onnx


    def add_meta_data(filename: str, meta_data: Dict[str, Any]):
        """Add meta data to an ONNX model. It is changed in-place.

        Args:
          filename:
            Filename of the ONNX model to be changed.
          meta_data:
            Key-value pairs.
        """
        model = onnx.load(filename)
        for key, value in meta_data.items():
            meta = model.metadata_props.add()
            meta.key = key
            meta.value = str(value)

        onnx.save(model, filename)


    def load_config(model):
        # The config lives next to the model, e.g. miro_ar-SA.onnx.json
        with open(f"{model}.json", "r", encoding="utf-8") as file:
            config = json.load(file)
        return config


    def generate_tokens(config):
        id_map = config["phoneme_id_map"]
        with open("tokens.txt", "w", encoding="utf-8") as f:
            for s, i in id_map.items():
                if s == "\n":  # skip invalid token
                    continue
                f.write(f"{s} {i}\n")
        print("Generated tokens.txt")


    def main():
        filename = "miro_ar-SA.onnx"
        config = load_config(filename)
        alphabet = config["alphabet"]
        phonemizer = config["phoneme_type"]
        if alphabet != "ipa" or phonemizer != "espeak":
            raise RuntimeError("only phoonnx models trained with 'ipa' and 'espeak' are supported")

        print("generate tokens")
        generate_tokens(config)

        print("add model metadata")
        meta_data = {
            "model_type": "vits",
            "comment": "piper",  # NOTE: only phoonnx models trained using espeak + ipa
            "language": "Arabic",
            "voice": config["lang_code"],  # e.g., en-us
            "has_espeak": 1,
            "n_speakers": config["num_speakers"],
            "sample_rate": config["audio"]["sample_rate"],
        }
        print(meta_data)
        add_meta_data(filename, meta_data)


    if __name__ == "__main__":
        main()
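To sanity-check the tokens.txt written by the script above, it can be read back into a mapping. This is only a minimal sketch: `read_tokens` is a hypothetical helper, not part of phoonnx or sherpa-onnx, and it assumes every line has the form `<symbol> <id>` with an integer id, as the script writes it.

```python
from typing import Dict


def read_tokens(path: str = "tokens.txt") -> Dict[str, int]:
    """Read a tokens.txt file back into a symbol -> id mapping.

    Hypothetical helper for checking the generated file; assumes one
    "<symbol> <id>" pair per line with an integer id.
    """
    token_to_id: Dict[str, int] = {}
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            # Split on the last space only, so a symbol that is itself
            # a space (written as "  <id>") is kept intact.
            symbol, token_id = line.rsplit(" ", 1)
            token_to_id[symbol] = int(token_id)
    return token_to_id
```

Comparing the result against `config["phoneme_id_map"]` (minus the skipped newline token) shows whether any symbol was lost when writing the file.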
Arabic female model
https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_dii_espeak
Arabic male V2 model, trained on a better dataset
https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_miro_espeak_V2
We also published a guest blog post about our collaboration with visually impaired Arabic users to create these models: https://blog.openvoiceos.org/posts/2025-10-01-arabic_tts_collaboration
With this PR https://github.com/TigreGotico/phoonnx/pull/19
models should be sherpa compatible out of the box.
Basque male voice: https://huggingface.co/OpenVoiceOS/phoonnx_eu-ES_miro_espeak
It should already include the metadata keys expected by sherpa in the model itself, and tokens.txt is provided directly in the same repo.
Basque female voice: https://huggingface.co/OpenVoiceOS/phoonnx_eu-ES_dii_espeak
Samples for both Basque voices are here: https://blog.openvoiceos.org/posts/2025-10-06-phoonnx
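To confirm that a downloaded model really carries its metadata, the embedded key-value pairs can be listed with the `onnx` package. This is a rough sketch: `EXPECTED_KEYS` simply mirrors the keys the conversion script earlier in this thread writes, and whether sherpa needs every one of them is an assumption. `read_meta_data` and `missing_keys` are hypothetical helper names.

```python
from typing import Dict, Iterable, List

# Keys written by the conversion script earlier in this thread.
# Assumption: these are the ones sherpa looks for.
EXPECTED_KEYS = (
    "model_type",
    "comment",
    "language",
    "voice",
    "has_espeak",
    "n_speakers",
    "sample_rate",
)


def read_meta_data(filename: str) -> Dict[str, str]:
    """Return the metadata key-value pairs stored in an ONNX model."""
    import onnx  # imported lazily so missing_keys() works without onnx

    model = onnx.load(filename)
    return {prop.key: prop.value for prop in model.metadata_props}


def missing_keys(
    meta_data: Dict[str, str], expected: Iterable[str] = EXPECTED_KEYS
) -> List[str]:
    """List the expected keys that are absent from meta_data."""
    return [key for key in expected if key not in meta_data]
```

For example, `missing_keys(read_meta_data("model.onnx"))` (the filename is a placeholder) returns an empty list when all of the keys above are present.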