VITS model conversion to ONNX and TFLite
Hello, I'm trying to convert this VITS PyTorch model to a TFLite version, but I'm getting stuck while converting it to ONNX and then to TF. The issues are listed below with all the required details. Has anyone solved this, or has anyone tried this before?
The code for the ONNX and TF conversion is below:
import numpy as np
import onnxruntime
import commons
import utils
import torch
from torch.utils.data import DataLoader
from onnx_tf.backend import prepare
import onnx
from text.symbols import symbols
from models import SynthesizerTrn, MultiPeriodDiscriminator
from data_utils import (TextAudioLoader, TextAudioCollate, TextAudioSpeakerLoader,
                        TextAudioSpeakerCollate, DistributedBucketSampler)
from scipy.io.wavfile import write
def to_numpy(tensor):
    return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()
hps = utils.get_hparams_from_file("./configs/ljs_base.json")
net_g = SynthesizerTrn(
    len(symbols),
    hps.data.filter_length // 2 + 1,
    hps.train.segment_size // hps.data.hop_length,
    **hps.model)
net_g.eval()
train_dataset = TextAudioLoader(hps.data.training_files, hps.data)
train_sampler = DistributedBucketSampler(
    train_dataset,
    hps.train.batch_size,
    [32, 300, 400, 500, 600, 700, 800, 900, 1000],
    num_replicas=1,
    rank=0,
    shuffle=True)
collate_fn = TextAudioCollate()
train_loader = DataLoader(train_dataset, num_workers=0, shuffle=False,
                          pin_memory=True, collate_fn=collate_fn,
                          batch_sampler=train_sampler)

output_model = []
inputs = []
print("exporting -------")
for batch_idx, (x, x_lengths, spec, spec_lengths, y, y_lengths) in enumerate(train_loader):
    # print(x.shape, x_lengths, spec.shape, spec_lengths[0])
    inputs = x, x_lengths, spec, spec_lengths
    output_model = net_g(x, x_lengths, spec, spec_lengths)
    net_g.to(torch.float32)
    with torch.no_grad():
        torch.onnx.export(
            net_g,
            (x, x_lengths, spec, spec_lengths),
            "vits-ljs-org.onnx",
            export_params=True,   # store the trained parameter weights inside the model file
            opset_version=10,     # the ONNX version to export the model to
            do_constant_folding=True,
            keep_initializers_as_inputs=True,
            verbose=False,
            input_names=["enc_p", "x_lengths.1", "inputs.101", "x_lengths"],
            output_names=["dp.convs.norms_2.2"],
            dynamic_axes={"enc_p": {0: 'BATCH_SIZE'},
                          "x_lengths.1": {0: 'BATCH_SIZE'},
                          "inputs.101": {0: 'BATCH_SIZE'},
                          "x_lengths": {0: 'BATCH_SIZE'},
                          "dp.convs.norms_2.2": {0: 'BATCH_SIZE'}})
    break
print("exported ONNX model ------")
---> For opset version 10, I get the following error while converting to ONNX:
torch.onnx.errors.SymbolicValueError: Unsupported: ONNX export of Pad in opset 9. The sizes of the padding must be constant. Please try opset version 11. [Caused by the value '921 defined in (%921 : int[] = prim::ListConstruct(%864, %864, %pad_length, %pad_length, %864, %864), scope: models.SynthesizerTrn::/models.TextEncoder::enc_p/attentions.Encoder::encoder/attentions.MultiHeadAttention::attn_layers.0 )' (type 'List[int]') in the TorchScript graph. The containing node has kind 'prim::ListConstruct'.]
Inputs:
#0: 864 defined in (%864 : Long(device=cpu) = onnx::Constant[value={0}](), scope: models.SynthesizerTrn::/models.TextEncoder::enc_p
) (type 'Tensor')
#1: 864 defined in (%864 : Long(device=cpu) = onnx::Constant[value={0}](), scope: models.SynthesizerTrn::/models.TextEncoder::enc_p
) (type 'Tensor')
#2: pad_length defined in (%pad_length : Long(requires_grad=0, device=cpu) = onnx::Sub(%895, %915), scope: models.SynthesizerTrn::/models.TextEncoder::enc_p/attentions.Encoder::encoder/attentions.MultiHeadAttention::attn_layers.0 # Desktop/cadence/vits/attentions.py:203:0
) (type 'Tensor')
#3: pad_length defined in (%pad_length : Long(requires_grad=0, device=cpu) = onnx::Sub(%895, %915), scope: models.SynthesizerTrn::/models.TextEncoder::enc_p/attentions.Encoder::encoder/attentions.MultiHeadAttention::attn_layers.0 # Desktop/cadence/vits/attentions.py:203:0
) (type 'Tensor')
#4: 864 defined in (%864 : Long(device=cpu) = onnx::Constant[value={0}](), scope: models.SynthesizerTrn::/models.TextEncoder::enc_p
) (type 'Tensor')
#5: 864 defined in (%864 : Long(device=cpu) = onnx::Constant[value={0}](), scope: models.SynthesizerTrn::/models.TextEncoder::enc_p
) (type 'Tensor')
Outputs:
#0: 921 defined in (%921 : int[] = prim::ListConstruct(%864, %864, %pad_length, %pad_length, %864, %864), scope: models.SynthesizerTrn::/models.TextEncoder::enc_p/attentions.Encoder::encoder/attentions.MultiHeadAttention::attn_layers.0
) (type 'List[int]')
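The error message itself suggests the workaround: the attention layers pad by a runtime-computed pad_length, which ONNX Pad only supports from opset 11 onwards. The re-export is the same call with only opset_version changed, e.g.:

with torch.no_grad():
    torch.onnx.export(
        net_g,
        (x, x_lengths, spec, spec_lengths),
        "vits-ljs-org.onnx",
        export_params=True,
        opset_version=12,  # >= 11 lifts the constant-padding restriction of Pad
        do_constant_folding=True,
        input_names=["enc_p", "x_lengths.1", "inputs.101", "x_lengths"],
        output_names=["dp.convs.norms_2.2"])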
---> After this I tried to export the model using opset version 12, and got the following warning while loading the model for ONNX inference:
envs/onnx-tf/lib/python3.9/site-packages/torch/onnx/utils.py:1703: UserWarning: The exported ONNX model failed ONNX shape inference. The model will not be executable by the ONNX Runtime. If this is unintended and you believe there is a bug, please report an issue at https://github.com/pytorch/pytorch/issues. Error reported by strict ONNX shape inference: [ShapeInferenceError] (op_type:Concat, node name: /enc_p/encoder/attn_layers.0/Concat_17): inputs has inconsistent type tensor(float) (Triggered internally at ../torch/csrc/jit/serialization/export.cpp:1484.) _C._check_onnx_proto(proto)
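To pin down the inconsistent-type Concat, I loaded the exported model and reran ONNX's own checker and shape inference (a small diagnostic sketch; the node name is taken from the warning above):

import onnx
from onnx import shape_inference

model = onnx.load("vits-ljs-org.onnx")
onnx.checker.check_model(model)                 # structural validation
inferred = shape_inference.infer_shapes(model)  # reproduces the Concat type complaint
for node in model.graph.node:
    if node.op_type == "Concat" and "attn_layers.0" in node.name:
        print(node.name, list(node.input))      # inspect the inputs of the failing Concat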
---> After this I tried to export the model using opset version 13; that removed the above error, but while converting to the TF model I got the following errors:
WARNING:absl:input.1 is not a valid tf.function parameter name. Sanitizing to input_1.
WARNING:absl:x_lengths.1 is not a valid tf.function parameter name. Sanitizing to x_lengths_1.
WARNING:absl:input.101 is not a valid tf.function parameter name. Sanitizing to input_101.
Traceback (most recent call last):
File "Desktop/cadence/vits/torch-to-tf.py", line 109, in
File "onnx-tensorflow/onnx_tf/backend_tf_module.py", line 99, in __call__ *
output_ops = self.backend._onnx_node_to_tensorflow_op(onnx_node,
File "onnx-tensorflow/onnx_tf/backend.py", line 347, in _onnx_node_to_tensorflow_op *
return handler.handle(node, tensor_dict=tensor_dict, strict=strict)
File "onnx-tensorflow/onnx_tf/handlers/handler.py", line 59, in handle *
return ver_handle(node, **kwargs)
File "onnx-tensorflow/onnx_tf/handlers/backend/gather.py", line 43, in version_13 *
return cls._common(node, **kwargs)
File "onnx-tensorflow/onnx_tf/handlers/backend/gather.py", line 23, in _common *
indices = kwargs["tensor_dict"][node.inputs[1]]
KeyError: 'input.1'
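The KeyError seems to come from the dots in the tensor names that absl was already warning about, so for the next attempt I sanitized every name in the graph before converting; roughly:

import onnx

model = onnx.load("vits-ljs-org.onnx")

def sanitize(name):
    # tf.function rejects '.' in parameter names; mirror absl's input_1-style renaming
    return name.replace(".", "_")

for node in model.graph.node:
    node.name = sanitize(node.name)
    node.input[:] = [sanitize(n) for n in node.input]
    node.output[:] = [sanitize(n) for n in node.output]
for value in list(model.graph.input) + list(model.graph.output):
    value.name = sanitize(value.name)
for init in model.graph.initializer:
    init.name = sanitize(init.name)

onnx.save(model, "vits-ljs-renamed.onnx")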
---> After renaming the nodes to TF-compatible names as above, I tried the conversion again and got the following error:
File "/home/tushar/onnx-tensorflow/onnx_tf/backend_tf_module.py", line 99, in __call__ *
output_ops = self.backend._onnx_node_to_tensorflow_op(onnx_node,
File "/home/tushar/onnx-tensorflow/onnx_tf/backend.py", line 347, in _onnx_node_to_tensorflow_op *
return handler.handle(node, tensor_dict=tensor_dict, strict=strict)
File "/home/tushar/onnx-tensorflow/onnx_tf/handlers/handler.py", line 59, in handle *
return ver_handle(node, **kwargs)
File "/home/tushar/onnx-tensorflow/onnx_tf/handlers/backend/conv.py", line 15, in version_11 *
return cls.conv(node, kwargs["tensor_dict"])
File "/home/tushar/onnx-tensorflow/onnx_tf/handlers/backend/conv_mixin.py", line 30, in conv *
x_rank = len(x.get_shape())
ValueError: Cannot take the length of shape with unknown rank.
---> I also tried to convert this model using onnx2keras, and I get this error:
File "/vits/abcd.py", line 66, in
If anyone has tried to convert this model to TFLite, please help.
Did you manage anything?
No, it is still open.