NeMo Runtime error when exporting FastPitch model to ONNX

I tried to export a FastPitch model I trained to ONNX with export.py. The command I used was:

python scripts/export.py "/home/xxx/TTSs/Nemo-models/nancy_fastpitch-44k-new-v3.nemo" "nancy_fastpitch-44k-new-v3.onnx" --runtime-check --device="cpu" --autocast

But it produced an error:

Traceback (most recent call last):
  File "scripts/export.py", line 169, in <module>
    nemo_export(sys.argv[1:])
  File "scripts/export.py", line 158, in nemo_export
    raise e
  File "scripts/export.py", line 139, in nemo_export
    output_example = forward_method(model)(*input_list, **input_dict)
  File "/mnt/d/TTSs/NeMo/nemo/collections/tts/models/fastpitch.py", line 576, in forward_for_export
    return self.fastpitch.infer(text=text, pitch=pitch, pace=pace, speaker=speaker)
  File "/mnt/d/TTSs/NeMo/nemo/collections/tts/modules/fastpitch.py", line 288, in infer
    enc_out, enc_mask = self.encoder(input=text, conditioning=spk_emb)
  File "/home/tony/Envs/nemo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/d/TTSs/NeMo/nemo/collections/tts/modules/transformer.py", line 262, in forward
    return self._forward(self.word_emb(input), (input != self.padding_idx).unsqueeze(2), conditioning)  # (B, L, 1)
  File "/home/tony/Envs/nemo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/tony/Envs/nemo/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 158, in forward
    return F.embedding(
  File "/home/tony/Envs/nemo/lib/python3.8/site-packages/torch/nn/functional.py", line 2044, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Inference tensors do not track version counter.

The error occurred while the script was at the "Getting output example" step.

What am I missing? Thank you!

May 03 '22 04:05 godspirit00

There seem to be a couple of issues with fastpitch.py that prevent a successful export. See related issue #4012. You can try the following workaround, which may work in your case:

change https://github.com/NVIDIA/NeMo/blob/adde38efb3ea96fd96991532bddf732d5c1a190a/scripts/export.py#L141-L142 to

if args.runtime_check:
    input_names = model.input_names 
    output_names = model.output_names

and then execute the command as

python scripts/export.py "/home/xxx/TTSs/Nemo-models/nancy_fastpitch-44k-new-v3.nemo" "nancy_fastpitch-44k-new-v3.onnx" --autocast --max-dim=44

Why this should work:

Removing --device="cpu" avoids the error you reported.
Adding --max-dim=44 avoids the RuntimeError: CUDA out of memory.
Removing --runtime-check skips retrieving input_names and output_names from the model. FastPitch initialises them only when _prepare_for_export is called:
https://github.com/NVIDIA/NeMo/blob/f45f56bb3730939f43ef2a8656bf8075d615f361/nemo/collections/tts/models/fastpitch.py#L515-L531
This is done internally in export():
https://github.com/NVIDIA/NeMo/blob/f45f56bb3730939f43ef2a8656bf8075d615f361/nemo/core/classes/exportable.py#L102-L104
but not when the model is initialized:
https://github.com/NVIDIA/NeMo/blob/f45f56bb3730939f43ef2a8656bf8075d615f361/nemo/collections/tts/models/fastpitch.py#L154
So if --runtime-check is left in the command, the conversion will fail with:

Traceback (most recent call last):
  File "scripts/export.py", line 170, in <module>
    nemo_export(sys.argv[1:])
  File "scripts/export.py", line 159, in nemo_export
    raise e
  File "scripts/export.py", line 142, in nemo_export
    input_names = model.input_names
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1186, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'FastPitchModel' object has no attribute 'input_names'
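
The mechanism behind this AttributeError can be reproduced without NeMo. The following is only a toy sketch: ToyExportable, collect_names and the name lists are hypothetical stand-ins and not NeMo code. It assumes, as described above, that the I/O names are created only inside _prepare_for_export, and that the edited script reads them only when --runtime-check is given.

```python
import argparse


class ToyExportable:
    """Hypothetical stand-in for a model whose I/O names are set lazily."""

    def _prepare_for_export(self):
        # Only this call creates the attributes (placeholder names, not NeMo's).
        self.input_names = ["text", "pitch", "pace"]
        self.output_names = ["spect"]

    def export(self):
        # export() prepares the model first, so the names exist afterwards.
        self._prepare_for_export()
        return self.input_names, self.output_names


def collect_names(model, runtime_check):
    # Same shape as the edited guard: read the names only for a runtime check.
    if runtime_check:
        return model.input_names, model.output_names
    return None, None


parser = argparse.ArgumentParser()
parser.add_argument("--runtime-check", action="store_true")
args = parser.parse_args([])  # flag omitted, as in the suggested command

model = ToyExportable()
names_without_flag = collect_names(model, args.runtime_check)  # guard skipped

try:
    collect_names(model, True)  # flag set before export(): attribute missing
    failed = False
except AttributeError:
    failed = True

names_after_export = model.export()  # names are available once export() ran
```

With the flag omitted the guard never touches the missing attributes, which matches why the shortened command gets past this point.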

May 03 '22 19:05 itzsimpl

@itzsimpl Thank you for your help. The export was successful.

May 05 '22 03:05 godspirit00

@itzsimpl I encountered a similar situation.

CODE:

from nemo.collections.tts.models import FastPitchModel
spec_generator = FastPitchModel.from_pretrained("tts_en_fastpitch")
spec_generator.export("fastpitch.onnx")

LOG:

[NeMo W 2022-05-07 13:07:20 optimizers:55] Apex was not found. Using the lamb or fused_adam optimizer will error out.
[NeMo W 2022-05-07 13:07:21 experimental:27] Module <class 'nemo.collections.nlp.data.language_modeling.megatron.megatron_batch_samplers.MegatronPretrainingRandomBatchSampler'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo I 2022-05-07 13:07:21 cloud:56] Found existing object /root/.cache/torch/NeMo/NeMo_1.8.2/tts_en_fastpitch_align/26d7e09971f1d611e24df90c7a9d9b38/tts_en_fastpitch_align.nemo.
[NeMo I 2022-05-07 13:07:21 cloud:62] Re-using file from: /root/.cache/torch/NeMo/NeMo_1.8.2/tts_en_fastpitch_align/26d7e09971f1d611e24df90c7a9d9b38/tts_en_fastpitch_align.nemo
[NeMo I 2022-05-07 13:07:21 common:747] Instantiating model from pre-trained checkpoint
[NeMo I 2022-05-07 13:07:21 tokenize_and_classify:88] Creating ClassifyFst grammars.
[NeMo W 2022-05-07 13:07:28 g2ps:84] apply_to_oov_word=None, it means that some of words will remain unchanged if they are not handled by one of rule in self.parse_one_word(). It is useful when you use tokenizer with set of phonemes and chars together, otherwise it can be not.
[NeMo W 2022-05-07 13:07:28 modelPT:148] If you intend to do training or fine-tuning, please call the ModelPT.setup_training_data() method and provide a valid configuration file to setup the train data loader.
    Train config : 
    dataset:
      _target_: nemo.collections.tts.torch.data.TTSDataset
      manifest_filepath: /ws/LJSpeech/nvidia_ljspeech_train_clean_ngc.json
      sample_rate: 22050
      sup_data_path: /raid/LJSpeech/supplementary
      sup_data_types:
      - align_prior_matrix
      - pitch
      n_fft: 1024
      win_length: 1024
      hop_length: 256
      window: hann
      n_mels: 80
      lowfreq: 0
      highfreq: 8000
      max_duration: null
      min_duration: 0.1
      ignore_file: null
      trim: false
      pitch_fmin: 65.40639132514966
      pitch_fmax: 2093.004522404789
      pitch_norm: true
      pitch_mean: 212.35873413085938
      pitch_std: 68.52806091308594
      use_beta_binomial_interpolator: true
    dataloader_params:
      drop_last: false
      shuffle: true
      batch_size: 24
      num_workers: 0
    
[NeMo W 2022-05-07 13:07:28 modelPT:155] If you intend to do validation, please call the ModelPT.setup_validation_data() or ModelPT.setup_multiple_validation_data() method and provide a valid configuration file to setup the validation data loader(s). 
    Validation config : 
    dataset:
      _target_: nemo.collections.tts.torch.data.TTSDataset
      manifest_filepath: /ws/LJSpeech/nvidia_ljspeech_val_clean_ngc.json
      sample_rate: 22050
      sup_data_path: /raid/LJSpeech/supplementary
      sup_data_types:
      - align_prior_matrix
      - pitch
      n_fft: 1024
      win_length: 1024
      hop_length: 256
      window: hann
      n_mels: 80
      lowfreq: 0
      highfreq: 8000
      max_duration: null
      min_duration: null
      ignore_file: null
      trim: false
      pitch_fmin: 65.40639132514966
      pitch_fmax: 2093.004522404789
      pitch_norm: true
      pitch_mean: 212.35873413085938
      pitch_std: 68.52806091308594
      use_beta_binomial_interpolator: true
    dataloader_params:
      drop_last: false
      shuffle: false
      batch_size: 24
      num_workers: 0
    
[NeMo I 2022-05-07 13:07:28 features:259] PADDING: 1
[NeMo I 2022-05-07 13:07:28 features:276] STFT using torch
[NeMo I 2022-05-07 13:07:32 save_restore_connector:209] Model FastPitchModel was successfully restored from /root/.cache/torch/NeMo/NeMo_1.8.2/tts_en_fastpitch_align/26d7e09971f1d611e24df90c7a9d9b38/tts_en_fastpitch_align.nemo.
[NeMo I 2022-05-07 13:07:32 export_utils:261] Swapped 28 modules
Traceback (most recent call last):
  File "./gen.py", line 12, in <module>
    spec_generator.export("fastpitch.onnx")
  File "/usr/local/lib/python3.8/dist-packages/nemo/core/classes/exportable.py", line 109, in export
    output_example = tuple(self.forward(*input_list, **input_dict))
  File "/usr/local/lib/python3.8/dist-packages/nemo/collections/tts/models/fastpitch.py", line 552, in forward_for_export
    return self.fastpitch.infer(text=text, pitch=pitch, pace=pace, speaker=speaker)
  File "/usr/local/lib/python3.8/dist-packages/nemo/collections/tts/modules/fastpitch.py", line 303, in infer
    dec_out, _ = self.decoder(input=len_regulated, seq_lens=dec_lens)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/nemo/collections/tts/modules/transformer.py", line 215, in forward
    return self._forward(input, mask_from_lens(seq_lens).unsqueeze(2), conditioning)
  File "/usr/local/lib/python3.8/dist-packages/nemo/collections/tts/modules/transformer.py", line 223, in _forward
    out = layer(out, mask=mask)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/nemo/collections/tts/modules/transformer.py", line 171, in forward
    output = self.dec_attn(dec_inp, attn_mask=~mask.squeeze(2))
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/nemo/collections/tts/modules/transformer.py", line 113, in forward
    return self._forward(inp, attn_mask)
  File "/usr/local/lib/python3.8/dist-packages/nemo/collections/tts/modules/transformer.py", line 139, in _forward
    attn_mask = attn_mask.repeat(n_head, attn_mask.size(2), 1)
RuntimeError: CUDA out of memory. Tried to allocate 7.21 GiB (GPU 0; 14.76 GiB total capacity; 7.77 GiB already allocated; 5.87 GiB free; 7.92 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

ENV:

GPU: Tesla T4 15GB
CUDA Version: 11.5
nemo-toolkit: 1.8.2
torch: 1.11.0
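
The last frame of the traceback above repeats the attention mask once per head and once per position, so the tensor being allocated grows with the square of the sequence length. A rough estimate of that growth, assuming the repeated mask has shape (n_head * batch, seq_len, seq_len) and 4 bytes per element; both are assumptions rather than values read from the NeMo source, and attn_mask_bytes is a hypothetical helper:

```python
def attn_mask_bytes(batch, n_head, seq_len, bytes_per_elem=4):
    """Bytes needed for a dense mask of shape (n_head * batch, seq_len, seq_len)."""
    return n_head * batch * seq_len * seq_len * bytes_per_elem


GIB = 1024 ** 3

# Hypothetical sizes, chosen only to show the quadratic trend.
for seq_len in (44, 512, 4096):
    size = attn_mask_bytes(batch=1, n_head=1, seq_len=seq_len)
    print(f"seq_len={seq_len:5d}: {size / GIB:.6f} GiB")
```

Doubling the sequence length quadruples the estimate, which is consistent with the earlier suggestion that a small --max-dim avoids the out-of-memory error in the script-based export.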

May 07 '22 05:05 uaex

@borisfom please review pr

Jul 23 '22 19:07 titu1994

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

Oct 08 '22 02:10 github-actions[bot]

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

Nov 08 '22 02:11 github-actions[bot]

This issue was closed because it has been inactive for 7 days since being marked as stale.

Nov 16 '22 02:11 github-actions[bot]
