Bug report - [RunTime Error with latest Jit Spanish Model]

Open basillicus opened this issue 6 months ago • 2 comments

🐛 Bug

When following the examples in the colab_examples notebook (PyTorch Example / More Examples section), using the latest Spanish jit model raises the following RuntimeError:

RuntimeError: stft requires the return_complex parameter be given for real inputs, and will further require that return_complex=True in a future PyTorch release.

To Reproduce

Steps to reproduce the behavior:

  1. Open the Colab_examples.ipynb notebook.
  2. In the PyTorch Example / More Examples section, in the cell that loads the model and the decoder, change the English model to the Spanish one:
# model, decoder = init_jit_model(models.stt_models.en.latest.jit, device=device)
model, decoder = init_jit_model(models.stt_models.es.latest.jit, device=device)
  3. Run the next two cells, up to the loop where the model is called. That is where the error shows up:
RuntimeError                              Traceback (most recent call last)
<ipython-input-29-2c955b63e0bf> in <cell line: 4>()
      2 input = prepare_model_input(read_batch(random.sample(batches, k=1)[0]),
      3                             device=device)
----> 4 output = model(input)
      5 for example in output:
      6     print(decoder(example.cpu()))

1 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1525                 or _global_backward_pre_hooks or _global_backward_hooks
   1526                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527             return forward_call(*args, **kwargs)
   1528 
   1529         try:

RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/stt_pretrained/models/model.py", line 42, in forward
    _4 = self.win_length
    _5 = torch.hann_window(self.n_fft, dtype=ops.prim.dtype(x), layout=None, device=ops.prim.device(x), pin_memory=None)
    x0 = __torch__.torch.functional.stft(x, _2, _3, _4, _5, True, "reflect", False, True, )
         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    _6 = torch.slice(x0, 0, 0, 9223372036854775807, 1)
    _7 = torch.slice(_6, 1, 0, 9223372036854775807, 1)
  File "code/__torch__/torch/functional.py", line 20, in stft
  else:
    input0 = input
  _2 = torch.stft(input0, n_fft, hop_length, win_length, window, normalized, onesided)
       ~~~~~~~~~~ <--- HERE
  return _2

Traceback of TorchScript, original code (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/torch/functional.py", line 465, in stft
        input = F.pad(input.view(extended_shape), (pad, pad), pad_mode)
        input = input.view(input.shape[-signal_dim:])
    return _VF.stft(input, n_fft, hop_length, win_length, window, normalized, onesided)
           ~~~~~~~~ <--- HERE
RuntimeError: stft requires the return_complex parameter be given for real inputs, and will further require that return_complex=True in a future PyTorch release.
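
Until the model itself is updated, a caller can at least make this failure easier to recognise. The following is a minimal sketch, not part of silero-models: `run_model` and `stub_model` are hypothetical names, and the stub only stands in for the jit model by raising the same message as in the traceback above (where the serialized `torch.stft` call passes no `return_complex` argument).

```python
# Start of the message reported in this issue; used to recognise the failure.
STFT_MARKER = "stft requires the return_complex parameter"


def run_model(model, batch):
    """Call model(batch); re-raise the stft failure with a clearer hint."""
    try:
        return model(batch)
    except RuntimeError as err:
        if STFT_MARKER in str(err):
            raise RuntimeError(
                "The serialized model calls torch.stft without return_complex, "
                "which the installed PyTorch rejects."
            ) from err
        raise  # unrelated errors pass through unchanged


def stub_model(batch):
    # Hypothetical stand-in for the jit model: fails the same way as in the report.
    raise RuntimeError(
        "stft requires the return_complex parameter be given for real inputs, "
        "and will further require that return_complex=True in a future PyTorch release."
    )


try:
    run_model(stub_model, batch=None)
except RuntimeError as err:
    hint = str(err)        # the clearer message
    cause = err.__cause__  # the original RuntimeError is kept as the cause
    print(hint)
```

In the notebook, `run_model(model, input)` would replace the bare `model(input)` call; the original TorchScript traceback stays attached as the chained cause.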

Expected behavior

The audio file should be transcribed to text

Environment

The environment is that of the colab_examples notebook itself:

Collecting environment information...
PyTorch version: 2.1.0+cu118
Is debug build: False
CUDA used to build PyTorch: 11.8
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.3 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: 14.0.0-1ubuntu1.1
CMake version: version 3.27.7
Libc version: glibc-2.35

Python version: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] (64-bit runtime)
Python platform: Linux-5.15.120+-x86_64-with-glibc2.35
Is CUDA available: False
CUDA runtime version: 11.8.89
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.9.6
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) CPU @ 2.20GHz
CPU family: 6
Model: 79
Thread(s) per core: 2
Core(s) per socket: 1
Socket(s): 1
Stepping: 0
BogoMIPS: 4399.99
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt arat md_clear arch_capabilities
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32 KiB (1 instance)
L1i cache: 32 KiB (1 instance)
L2 cache: 256 KiB (1 instance)
L3 cache: 55 MiB (1 instance)
NUMA node(s): 1
NUMA node0 CPU(s): 0,1
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Mitigation; PTE Inversion
Vulnerability Mds: Vulnerable; SMT Host state unknown
Vulnerability Meltdown: Vulnerable
Vulnerability Mmio stale data: Vulnerable
Vulnerability Retbleed: Vulnerable
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1: Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers
Vulnerability Spectre v2: Vulnerable, IBPB: disabled, STIBP: disabled, PBRSB-eIBRS: Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Vulnerable

Versions of relevant libraries:
[pip3] numpy==1.23.5
[pip3] torch==2.1.0+cu118
[pip3] torchaudio==2.1.0+cu118
[pip3] torchdata==0.7.0
[pip3] torchsummary==1.5.1
[pip3] torchtext==0.16.0
[pip3] torchvision==0.16.0+cu118
[pip3] triton==2.1.0
[conda] Could not collect
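
When comparing this environment with one where the model runs, the `+cu118` build suffix makes plain string comparison of versions unreliable. The following is a small sketch of normalising the reported string first; `parse_version` is a hypothetical helper, and no particular "working" PyTorch version is assumed here.

```python
import re


def parse_version(version: str) -> tuple:
    """Return the numeric release of a version string as a tuple of ints."""
    release = version.split("+", 1)[0]  # drop a local build tag such as "+cu118"
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part) for part in match.groups(default="0"))


reported = parse_version("2.1.0+cu118")  # the torch version listed above
print(reported)  # (2, 1, 0)
```

Tuples compare element by element, so `parse_version(a) < parse_version(b)` orders two installs correctly even when one carries a CUDA suffix.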

basillicus, Dec 06 '23 12:12