NeMo

scripts/export.py fails with `--device=cpu`

Open itzsimpl opened this issue 2 years ago • 4 comments

Regardless of the model being exported, the script fails with:

  File "scripts/export.py", line 160, in nemo_export
    raise e
  File "scripts/export.py", line 139, in nemo_export
    output_example = forward_method(model)(*input_list, **input_dict)
  File "/data/nemo_main/nemo/collections/tts/models/hifigan.py", line 425, in forward_for_export
    return self.generator(x=spec)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1111, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data/nemo_main/nemo/collections/tts/modules/hifigan_modules.py", line 245, in forward
    x = self.conv_pre(x)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1118, in _call_impl
    result = hook(self, input)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/utils/weight_norm.py", line 64, in __call__
    setattr(module, self.name, self.compute_weight(module))
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/utils/weight_norm.py", line 24, in compute_weight
    return _weight_norm(v, g, self.dim)
RuntimeError: Inference tensors do not track version counter.

Tested with nemo:1.9.0rc0 based on pytorch:22.04-py3. Related to #4100

itzsimpl avatar May 04 '22 12:05 itzsimpl

@borisfom

titu1994 avatar Jun 04 '22 22:06 titu1994

@borisfom please check

titu1994 avatar Jul 23 '22 19:07 titu1994

Yes, I can reproduce it with the latest NeMo. With the changed export script it fails in a different place, though: during the actual export() call. That error usually means that some training-only code ends up in the export path. This may be the fault of a Torch internal module (weight_norm); it needs investigation.

borisfom avatar Aug 26 '22 02:08 borisfom
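
For context on the error message itself: "Inference tensors do not track version counter." is raised by PyTorch when code reads the version counter of a tensor that was created under torch.inference_mode(). The following is a minimal sketch, independent of NeMo and of scripts/export.py, that triggers the same message by reading the private `_version` attribute of such a tensor. It does not reproduce the export failure and is not a fix. The helper names `version_counter_status` and `compare_tensors` are hypothetical, and the import guard is only there so the file still loads where PyTorch is not installed.

```python
try:
    import torch
except ImportError:  # allows the file to load where PyTorch is absent
    torch = None


def version_counter_status(t):
    """Return the tensor's version counter, or the error text if it has none."""
    try:
        return t._version
    except RuntimeError as err:
        return str(err)


def compare_tensors():
    """Create one inference tensor and one normal tensor and report on both."""
    with torch.inference_mode():
        # Tensors allocated inside inference mode are "inference tensors".
        inference_t = torch.ones(3)
    # Tensors allocated outside inference mode keep a version counter.
    normal_t = torch.ones(3)
    return {
        "inference": (inference_t.is_inference(), version_counter_status(inference_t)),
        "normal": (normal_t.is_inference(), version_counter_status(normal_t)),
    }


if torch is not None:
    for name, (is_inf, status) in compare_tensors().items():
        print(name, "is_inference =", is_inf, "| version counter:", status)
```

Anything that relies on version counters, such as saving a tensor for backward, can hit this error when it is handed an inference tensor, which is consistent with the remark above about training-only code ending up in the path.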

@borisfom you may wish to check if a solution similar to https://github.com/NVIDIA/NeMo/pull/4106 would help.

itzsimpl avatar Aug 26 '22 10:08 itzsimpl

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] avatar Oct 08 '22 02:10 github-actions[bot]

This issue was closed because it has been inactive for 7 days since being marked as stale.

github-actions[bot] avatar Oct 15 '22 02:10 github-actions[bot]