
CUDA out of memory

Open at2me opened this issue 10 months ago • 2 comments

It handled a small file (60 KB) just fine, but a problem came up when transcribing speech in a 2.1 MB file:

(venv) root@dk04:~/GigaAM# file /mnt/rec/0b5ef5be-0925-4462-8f4e-cecab7f4d572.wav
/mnt/rec/0b5ef5be-0925-4462-8f4e-cecab7f4d572.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, mono 8000 Hz

(venv) root@dk04:~/GigaAM# du -sh /mnt/rec/0b5ef5be-0925-4462-8f4e-cecab7f4d572.wav
2.1M  /mnt/rec/0b5ef5be-0925-4462-8f4e-cecab7f4d572.wav

(venv) root@dk04:~/GigaAM# python3 ctc_inference.py --model_config ./data/ctc_model_config.yaml --model_weights ./data/ctc_model_weights.ckpt --device cuda --audio_path /mnt/rec/0b5ef5be-0925-4462-8f4e-cecab7f4d572.wav
[NeMo W 2024-04-12 07:15:13 nemo_logging:393] /GigaAM/venv/lib/python3.11/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
      torch.utils._pytree._register_pytree_node(
    
[NeMo W 2024-04-12 07:15:13 nemo_logging:393] Could not load dataset as `manifest_filepath` was None. Provided config : {'batch_size': 10, 'trim_silence': False, 'max_duration': 25.0, 'min_duration': 0.1, 'shuffle': True, 'is_tarred': False, 'num_workers': 8, 'pin_memory': True, 'manifest_filepath': None, 'labels': [' ', 'а', 'б', 'в', 'г', 'д', 'е', 'ж', 'з', 'и', 'й', 'к', 'л', 'м', 'н', 'о', 'п', 'р', 'с', 'т', 'у', 'ф', 'х', 'ц', 'ч', 'ш', 'щ', 'ъ', 'ы', 'ь', 'э', 'ю', 'я'], 'sample_rate': 16000}
[NeMo W 2024-04-12 07:15:14 nemo_logging:393] Could not load dataset as `manifest_filepath` was None. Provided config : {'batch_size': 20, 'shuffle': False, 'num_workers': 4, 'min_duration': 0.1, 'pin_memory': True, 'manifest_filepath': None, 'labels': [' ', 'а', 'б', 'в', 'г', 'д', 'е', 'ж', 'з', 'и', 'й', 'к', 'л', 'м', 'н', 'о', 'п', 'р', 'с', 'т', 'у', 'ф', 'х', 'ц', 'ч', 'ш', 'щ', 'ъ', 'ы', 'ь', 'э', 'ю', 'я'], 'sample_rate': 16000}
[NeMo W 2024-04-12 07:15:14 nemo_logging:393] Could not load dataset as `manifest_filepath` was None. Provided config : {'manifest_filepath': None, 'batch_size': 100, 'shuffle': False, 'num_workers': 4, 'pin_memory': True, 'labels': [' ', 'а', 'б', 'в', 'г', 'д', 'е', 'ж', 'з', 'и', 'й', 'к', 'л', 'м', 'н', 'о', 'п', 'р', 'с', 'т', 'у', 'ф', 'х', 'ц', 'ч', 'ш', 'щ', 'ъ', 'ы', 'ь', 'э', 'ю', 'я'], 'sample_rate': 16000}
[NeMo I 2024-04-12 07:15:14 nemo_logging:381] PADDING: 0
Transcribing:   0%|                                                                                                                                                                                                | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/GigaAM/ctc_inference.py", line 76, in <module>
    main(
  File "/GigaAM/ctc_inference.py", line 70, in main
    transcription = model.transcribe([audio_path])[0]
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/GigaAM/venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/GigaAM/venv/lib/python3.11/site-packages/nemo/collections/asr/models/ctc_models.py", line 198, in transcribe
    logits, logits_len, greedy_predictions = self.forward(
                                             ^^^^^^^^^^^^^
  File "/GigaAM/venv/lib/python3.11/site-packages/nemo/core/classes/common.py", line 1087, in __call__
    outputs = wrapped(*args, **kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/GigaAM/venv/lib/python3.11/site-packages/nemo/collections/asr/models/ctc_models.py", line 543, in forward
    encoder_output = self.encoder(audio_signal=processed_signal, length=processed_signal_length)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/GigaAM/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/GigaAM/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/GigaAM/venv/lib/python3.11/site-packages/nemo/core/classes/common.py", line 1087, in __call__
    outputs = wrapped(*args, **kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/GigaAM/venv/lib/python3.11/site-packages/nemo/collections/asr/modules/conformer_encoder.py", line 491, in forward
    return self.forward_internal(
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/GigaAM/venv/lib/python3.11/site-packages/nemo/collections/asr/modules/conformer_encoder.py", line 571, in forward_internal
    audio_signal = layer(
                   ^^^^^^
  File "/GigaAM/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/GigaAM/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/GigaAM/venv/lib/python3.11/site-packages/nemo/collections/asr/parts/submodules/conformer_modules.py", line 162, in forward
    x = self.self_attn(query=x, key=x, value=x, mask=att_mask, pos_emb=pos_emb, cache=cache_last_channel)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/GigaAM/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/GigaAM/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/GigaAM/venv/lib/python3.11/site-packages/nemo/collections/asr/parts/submodules/multi_head_attention.py", line 238, in forward
    matrix_bd = torch.matmul(q_with_bias_v, p.transpose(-2, -1))
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.35 GiB. GPU 0 has a total capacity of 2.93 GiB of which 981.75 MiB is free. Including non-PyTorch memory, this process has 1.97 GiB memory in use. Of the allocated memory 1.78 GiB is allocated by PyTorch, and 104.70 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
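
For scale: the failing file is 16-bit mono PCM at 8000 Hz, so its duration follows directly from its size, and the whole recording goes through the Conformer encoder as a single sequence; the relative-position attention matmul in the traceback alone tries to allocate 1.35 GiB, which does not fit next to everything else on a 3 GB card. A quick back-of-the-envelope check (my own arithmetic, not output of the script):

    # Rough duration of the failing file, estimated from its size alone
    # (16-bit mono PCM at 8000 Hz, as reported by `file`).
    size_bytes = 2.1 * 1024 * 1024        # ~2.1 MiB on disk
    bytes_per_second = 8000 * 2           # sample rate * 2 bytes per sample
    print(size_bytes / bytes_per_second)  # ~137 seconds fed in as one sequence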

With small files there are no problems:

(venv) root@dk04:~/GigaAM# file /mnt/rec/0121d02a-d33e-43dd-ab2e-eb7067d0fac7.wav
/mnt/rec/0121d02a-d33e-43dd-ab2e-eb7067d0fac7.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, mono 8000 Hz

(venv) root@dk04:~/GigaAM# du -sh /mnt/rec/0121d02a-d33e-43dd-ab2e-eb7067d0fac7.wav
60K /mnt/rec/0121d02a-d33e-43dd-ab2e-eb7067d0fac7.wav

(venv) root@dk04:~/GigaAM# python3 ctc_inference.py --model_config ./data/ctc_model_config.yaml --model_weights ./data/ctc_model_weights.ckpt --device cuda --audio_path /mnt/rec/0121d02a-d33e-43dd-ab2e-eb7067d0fac7.wav
[NeMo W 2024-04-12 07:17:38 nemo_logging:393] /GigaAM/venv/lib/python3.11/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
      torch.utils._pytree._register_pytree_node(
    
[NeMo W 2024-04-12 07:17:38 nemo_logging:393] Could not load dataset as `manifest_filepath` was None. Provided config : {'batch_size': 10, 'trim_silence': False, 'max_duration': 25.0, 'min_duration': 0.1, 'shuffle': True, 'is_tarred': False, 'num_workers': 8, 'pin_memory': True, 'manifest_filepath': None, 'labels': [' ', 'а', 'б', 'в', 'г', 'д', 'е', 'ж', 'з', 'и', 'й', 'к', 'л', 'м', 'н', 'о', 'п', 'р', 'с', 'т', 'у', 'ф', 'х', 'ц', 'ч', 'ш', 'щ', 'ъ', 'ы', 'ь', 'э', 'ю', 'я'], 'sample_rate': 16000}
[NeMo W 2024-04-12 07:17:38 nemo_logging:393] Could not load dataset as `manifest_filepath` was None. Provided config : {'batch_size': 20, 'shuffle': False, 'num_workers': 4, 'min_duration': 0.1, 'pin_memory': True, 'manifest_filepath': None, 'labels': [' ', 'а', 'б', 'в', 'г', 'д', 'е', 'ж', 'з', 'и', 'й', 'к', 'л', 'м', 'н', 'о', 'п', 'р', 'с', 'т', 'у', 'ф', 'х', 'ц', 'ч', 'ш', 'щ', 'ъ', 'ы', 'ь', 'э', 'ю', 'я'], 'sample_rate': 16000}
[NeMo W 2024-04-12 07:17:38 nemo_logging:393] Could not load dataset as `manifest_filepath` was None. Provided config : {'manifest_filepath': None, 'batch_size': 100, 'shuffle': False, 'num_workers': 4, 'pin_memory': True, 'labels': [' ', 'а', 'б', 'в', 'г', 'д', 'е', 'ж', 'з', 'и', 'й', 'к', 'л', 'м', 'н', 'о', 'п', 'р', 'с', 'т', 'у', 'ф', 'х', 'ц', 'ч', 'ш', 'щ', 'ъ', 'ы', 'ь', 'э', 'ю', 'я'], 'sample_rate': 16000}
[NeMo I 2024-04-12 07:17:38 nemo_logging:381] PADDING: 0
Transcribing: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  1.90it/s]
transcription: нету нету алло

On CPU there were no problems either. Trying it on the GPU:

# nvidia-smi
Fri Apr 12 07:10:44 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1060 3GB    On  |   00000000:01:00.0  On |                  N/A |
|  0%   42C    P8              7W /  120W |       1MiB /   3072MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

Could you please advise, maybe some parameters need adjusting? Below is a graph of the moment the problem occurs (a possible chunked-transcription workaround is sketched after it):

root@dk04:~# nvtop
Device 0 [NVIDIA GeForce GTX 1060 3GB] PCIe GEN 1@16x RX: 1.000 KiB/s TX: 0.000 KiB/s
 GPU 139MHz  MEM 405MHz  TEMP  43°C FAN   0% POW   7 / 120 W
 GPU[                                   0%] MEM[|                     0.068Gi/3.000Gi]
   ┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
100│GPU0 %                                                                                                                    │
   │GPU0 mem%                                                                                                                 │
   │                                                                                                                          │
   │                                                                                                                          │
   │                                                                                                                          │
 75│                                                                                                                          │
   │                                                                                             ┌─┐                          │
   │                                                                                             │ │                          │
   │                                                                                             │ │                          │
   │                                                                                             │ │                          │
 50│                                                                                             │ │                          │
   │                                                                                             │ │                          │
   │                                                                                           ┌─┘ │                          │
   │                                                                                           │   │                          │
   │                                                                                           │   │                          │
 25│                                                                                          ┌┼──┐│                          │
   │                                                                                          ││  ││                          │
   │                                                                                          ││  ││                          │
   │                                                                                          ││  ││                          │
   │                                                                                          ││  ││                          │
  0│──────────────────────────────────────────────────────────────────────────────────────────┴┘  └┴──────────────────────────│
   └61s──────────────────────────45s────────────────────────────30s───────────────────────────15s───────────────────────────0s┘
    PID USER DEV     TYPE  GPU        GPU MEM    CPU  HOST MEM Command                                                          
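
One workaround that might fit into 3 GB (a sketch only, not verified on this setup): split the long recording into chunks of at most about 25-30 seconds, transcribe each chunk separately, and join the texts. The helper below is hypothetical: it assumes the `soundfile` package is available and that `model` is the already-loaded CTC model from ctc_inference.py. The only model call it relies on is `model.transcribe()`, the same call that already works for the short file:

    import math
    import os
    import tempfile

    import soundfile as sf


    def transcribe_in_chunks(model, audio_path, chunk_sec=25.0):
        """Hypothetical helper: cut a long wav into chunk_sec pieces and
        transcribe them one by one, so the encoder never sees the full file."""
        # Keep the original 16-bit PCM encoding when reading and writing.
        audio, sr = sf.read(audio_path, dtype="int16")
        samples_per_chunk = int(chunk_sec * sr)
        n_chunks = math.ceil(len(audio) / samples_per_chunk)

        texts = []
        with tempfile.TemporaryDirectory() as tmp_dir:
            for i in range(n_chunks):
                piece = audio[i * samples_per_chunk : (i + 1) * samples_per_chunk]
                piece_path = os.path.join(tmp_dir, f"chunk_{i:04d}.wav")
                sf.write(piece_path, piece, sr)
                # Same call ctc_inference.py already makes, just on a short file.
                texts.append(model.transcribe([piece_path])[0])
        return " ".join(texts)

Words cut exactly at a chunk boundary can come out garbled, so slightly overlapping the chunks or splitting on silence (for example with a VAD) would be the next refinement.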

at2me · Apr 12 '24 12:04