silero-models
Bug report: RuntimeError with the latest jit Spanish model
🐛 Bug
When following the examples in the colab_examples notebook (PyTorch Example / More Examples section), using the latest Spanish jit model raises the following runtime error:
RuntimeError: stft requires the return_complex parameter be given for real inputs, and will further require that return_complex=True in a future PyTorch release.
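The failure can be reproduced with torch.stft alone. Below is a minimal sketch (plain PyTorch on the notebook's 2.1.0 build, arbitrary sizes) illustrating what the error message asks for; note that the traced Spanish model bakes in the old stft call without return_complex, so the flag cannot simply be added from the caller's side:

import torch

x = torch.randn(16000)          # one second of fake 16 kHz audio
window = torch.hann_window(400)

# Without return_complex this raises the RuntimeError quoted above
# on the notebook's PyTorch 2.1.0:
#   torch.stft(x, n_fft=400, hop_length=160, win_length=400, window=window)

# Passing the parameter explicitly satisfies the requirement:
spec = torch.stft(x, n_fft=400, hop_length=160, win_length=400,
                  window=window, return_complex=True)
print(spec.shape, spec.dtype)   # complex spectrogram, e.g. torch.complex64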
To Reproduce
Steps to reproduce the behavior:
- Open the colab_examples.ipynb notebook
- In the PyTorch Example / More Examples section, in the cell that loads the model and decoder, change the English model to the Spanish one:
# model, decoder = init_jit_model(models.stt_models.en.latest.jit, device=device)
model, decoder = init_jit_model(models.stt_models.es.latest.jit, device=device)
- Keep running the notebook for two more cells, up to the loop where the model is called; that is where the error shows up (a condensed reproduction sketch is included after the traceback below):
RuntimeError Traceback (most recent call last)
<ipython-input-29-2c955b63e0bf> in <cell line: 4>()
2 input = prepare_model_input(read_batch(random.sample(batches, k=1)[0]),
3 device=device)
----> 4 output = model(input)
5 for example in output:
6 print(decoder(example.cpu()))
1 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
1525 or _global_backward_pre_hooks or _global_backward_hooks
1526 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527 return forward_call(*args, **kwargs)
1528
1529 try:
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/__torch__/stt_pretrained/models/model.py", line 42, in forward
_4 = self.win_length
_5 = torch.hann_window(self.n_fft, dtype=ops.prim.dtype(x), layout=None, device=ops.prim.device(x), pin_memory=None)
x0 = __torch__.torch.functional.stft(x, _2, _3, _4, _5, True, "reflect", False, True, )
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
_6 = torch.slice(x0, 0, 0, 9223372036854775807, 1)
_7 = torch.slice(_6, 1, 0, 9223372036854775807, 1)
File "code/__torch__/torch/functional.py", line 20, in stft
else:
input0 = input
_2 = torch.stft(input0, n_fft, hop_length, win_length, window, normalized, onesided)
~~~~~~~~~~ <--- HERE
return _2
Traceback of TorchScript, original code (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/torch/functional.py", line 465, in stft
input = F.pad(input.view(extended_shape), (pad, pad), pad_mode)
input = input.view(input.shape[-signal_dim:])
return _VF.stft(input, n_fft, hop_length, win_length, window, normalized, onesided)
~~~~~~~~ <--- HERE
RuntimeError: stft requires the return_complex parameter be given for real inputs, and will further require that return_complex=True in a future PyTorch release.
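For completeness, here is a condensed, self-contained sketch of the failing cells. The helper names come from silero-models; the import path and the wav glob are assumptions standing in for however the notebook obtains its helpers and test files:

import random
from glob import glob

import torch
from omegaconf import OmegaConf
# Helpers shipped with silero-models (the notebook may obtain them differently):
from utils import (init_jit_model, split_into_batches,
                   read_batch, prepare_model_input)

device = torch.device('cpu')
models = OmegaConf.load('models.yml')          # silero-models model registry

# Spanish jit model instead of the English one used by the notebook
model, decoder = init_jit_model(models.stt_models.es.latest.jit, device=device)

test_files = glob('*.wav')                     # any audio files the helpers can read
batches = split_into_batches(test_files, batch_size=10)

input = prepare_model_input(read_batch(random.sample(batches, k=1)[0]),
                            device=device)
output = model(input)                          # <-- RuntimeError is raised here
for example in output:
    print(decoder(example.cpu()))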
Expected behavior
The audio file should be transcribed to text.
Environment
The environment of the colab_examples notebook itself:
Collecting environment information...
PyTorch version: 2.1.0+cu118
Is debug build: False
CUDA used to build PyTorch: 11.8
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.3 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: 14.0.0-1ubuntu1.1
CMake version: version 3.27.7
Libc version: glibc-2.35

Python version: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] (64-bit runtime)
Python platform: Linux-5.15.120+-x86_64-with-glibc2.35
Is CUDA available: False
CUDA runtime version: 11.8.89
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.9.6
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) CPU @ 2.20GHz
CPU family: 6
Model: 79
Thread(s) per core: 2
Core(s) per socket: 1
Socket(s): 1
Stepping: 0
BogoMIPS: 4399.99
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt arat md_clear arch_capabilities
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32 KiB (1 instance)
L1i cache: 32 KiB (1 instance)
L2 cache: 256 KiB (1 instance)
L3 cache: 55 MiB (1 instance)
NUMA node(s): 1
NUMA node0 CPU(s): 0,1
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Mitigation; PTE Inversion
Vulnerability Mds: Vulnerable; SMT Host state unknown
Vulnerability Meltdown: Vulnerable
Vulnerability Mmio stale data: Vulnerable
Vulnerability Retbleed: Vulnerable
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1: Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers
Vulnerability Spectre v2: Vulnerable, IBPB: disabled, STIBP: disabled, PBRSB-eIBRS: Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Vulnerable

Versions of relevant libraries:
[pip3] numpy==1.23.5
[pip3] torch==2.1.0+cu118
[pip3] torchaudio==2.1.0+cu118
[pip3] torchdata==0.7.0
[pip3] torchsummary==1.5.1
[pip3] torchtext==0.16.0
[pip3] torchvision==0.16.0+cu118
[pip3] triton==2.1.0
[conda] Could not collect
Looks like the Spanish model is too old.
Greetings. I'm facing the same problem here. I've managed to get the onnx Spanish model working, but I'd like to know if there's any way to use the jit model as it is now. Is there a previous version of torch that would run it? Thanks in advance for any response.
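For reference, a minimal sketch of the ONNX route mentioned above, following the pattern of the repository's ONNX example; the input tensor name 'input', the registry URL, and the test file name are taken from that example or assumed here and may need adjusting:

import torch
import onnxruntime
from omegaconf import OmegaConf

# Fetch the model registry and the latest Spanish ONNX model
torch.hub.download_url_to_file(
    'https://raw.githubusercontent.com/snakers4/silero-models/master/models.yml',
    'models.yml')
models = OmegaConf.load('models.yml')
torch.hub.download_url_to_file(models.stt_models.es.latest.onnx, 'model.onnx')

# Reuse the Spanish decoder and the audio helpers shipped with silero_stt
_, decoder, utils = torch.hub.load(repo_or_dir='snakers4/silero-models',
                                   model='silero_stt', language='es')
(read_batch, split_into_batches, read_audio, prepare_model_input) = utils

ort_session = onnxruntime.InferenceSession('model.onnx')

test_files = ['spanish_sample.wav']            # any file the helpers can read
batches = split_into_batches(test_files, batch_size=10)
input = prepare_model_input(read_batch(batches[0]))

# ONNX inference and decoding
ort_outs = ort_session.run(None, {'input': input.detach().cpu().numpy()})
print(decoder(torch.Tensor(ort_outs[0])[0]))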