GigaAM
CUDA out of memory
It handled a small file (60 KB) perfectly, but speech recognition fails on a 2.1 MB file:
(venv) root@dk04:~/GigaAM# file /mnt/rec/0b5ef5be-0925-4462-8f4e-cecab7f4d572.wav
/mnt/rec/0b5ef5be-0925-4462-8f4e-cecab7f4d572.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, mono 8000 Hz
(venv) root@dk04:~/GigaAM# du -sh /mnt/rec/0b5ef5be-0925-4462-8f4e-cecab7f4d572.wav
2.1M /mnt/rec/0b5ef5be-0925-4462-8f4e-cecab7f4d572.wav
(venv) root@dk04:~/GigaAM# python3 ctc_inference.py --model_config ./data/ctc_model_config.yaml --model_weights ./data/ctc_model_weights.ckpt --device cuda --audio_path /mnt/rec/0b5ef5be-0925-4462-8f4e-cecab7f4d572.wav
[NeMo W 2024-04-12 07:15:13 nemo_logging:393] /GigaAM/venv/lib/python3.11/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
torch.utils._pytree._register_pytree_node(
[NeMo W 2024-04-12 07:15:13 nemo_logging:393] Could not load dataset as `manifest_filepath` was None. Provided config : {'batch_size': 10, 'trim_silence': False, 'max_duration': 25.0, 'min_duration': 0.1, 'shuffle': True, 'is_tarred': False, 'num_workers': 8, 'pin_memory': True, 'manifest_filepath': None, 'labels': [' ', 'а', 'б', 'в', 'г', 'д', 'е', 'ж', 'з', 'и', 'й', 'к', 'л', 'м', 'н', 'о', 'п', 'р', 'с', 'т', 'у', 'ф', 'х', 'ц', 'ч', 'ш', 'щ', 'ъ', 'ы', 'ь', 'э', 'ю', 'я'], 'sample_rate': 16000}
[NeMo W 2024-04-12 07:15:14 nemo_logging:393] Could not load dataset as `manifest_filepath` was None. Provided config : {'batch_size': 20, 'shuffle': False, 'num_workers': 4, 'min_duration': 0.1, 'pin_memory': True, 'manifest_filepath': None, 'labels': [' ', 'а', 'б', 'в', 'г', 'д', 'е', 'ж', 'з', 'и', 'й', 'к', 'л', 'м', 'н', 'о', 'п', 'р', 'с', 'т', 'у', 'ф', 'х', 'ц', 'ч', 'ш', 'щ', 'ъ', 'ы', 'ь', 'э', 'ю', 'я'], 'sample_rate': 16000}
[NeMo W 2024-04-12 07:15:14 nemo_logging:393] Could not load dataset as `manifest_filepath` was None. Provided config : {'manifest_filepath': None, 'batch_size': 100, 'shuffle': False, 'num_workers': 4, 'pin_memory': True, 'labels': [' ', 'а', 'б', 'в', 'г', 'д', 'е', 'ж', 'з', 'и', 'й', 'к', 'л', 'м', 'н', 'о', 'п', 'р', 'с', 'т', 'у', 'ф', 'х', 'ц', 'ч', 'ш', 'щ', 'ъ', 'ы', 'ь', 'э', 'ю', 'я'], 'sample_rate': 16000}
[NeMo I 2024-04-12 07:15:14 nemo_logging:381] PADDING: 0
Transcribing: 0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/GigaAM/ctc_inference.py", line 76, in <module>
main(
File "/GigaAM/ctc_inference.py", line 70, in main
transcription = model.transcribe([audio_path])[0]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/GigaAM/venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/GigaAM/venv/lib/python3.11/site-packages/nemo/collections/asr/models/ctc_models.py", line 198, in transcribe
logits, logits_len, greedy_predictions = self.forward(
^^^^^^^^^^^^^
File "/GigaAM/venv/lib/python3.11/site-packages/nemo/core/classes/common.py", line 1087, in __call__
outputs = wrapped(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/GigaAM/venv/lib/python3.11/site-packages/nemo/collections/asr/models/ctc_models.py", line 543, in forward
encoder_output = self.encoder(audio_signal=processed_signal, length=processed_signal_length)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/GigaAM/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/GigaAM/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/GigaAM/venv/lib/python3.11/site-packages/nemo/core/classes/common.py", line 1087, in __call__
outputs = wrapped(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/GigaAM/venv/lib/python3.11/site-packages/nemo/collections/asr/modules/conformer_encoder.py", line 491, in forward
return self.forward_internal(
^^^^^^^^^^^^^^^^^^^^^^
File "/GigaAM/venv/lib/python3.11/site-packages/nemo/collections/asr/modules/conformer_encoder.py", line 571, in forward_internal
audio_signal = layer(
^^^^^^
File "/GigaAM/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/GigaAM/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/GigaAM/venv/lib/python3.11/site-packages/nemo/collections/asr/parts/submodules/conformer_modules.py", line 162, in forward
x = self.self_attn(query=x, key=x, value=x, mask=att_mask, pos_emb=pos_emb, cache=cache_last_channel)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/GigaAM/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/GigaAM/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/GigaAM/venv/lib/python3.11/site-packages/nemo/collections/asr/parts/submodules/multi_head_attention.py", line 238, in forward
matrix_bd = torch.matmul(q_with_bias_v, p.transpose(-2, -1))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.35 GiB. GPU 0 has a total capacity of 2.93 GiB of which 981.75 MiB is free. Including non-PyTorch memory, this process has 1.97 GiB memory in use. Of the allocated memory 1.78 GiB is allocated by PyTorch, and 104.70 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
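For context, the two file sizes can be converted to approximate audio durations from the format reported by `file` (16-bit mono PCM at 8000 Hz). This is a rough sketch: it ignores the WAV header and the rounding done by `du`, and the helper name `pcm_duration_seconds` is made up for illustration.

```python
def pcm_duration_seconds(size_bytes, sample_rate=8000, bytes_per_sample=2, channels=1):
    """Approximate duration of raw PCM audio, ignoring the small WAV header."""
    bytes_per_second = sample_rate * bytes_per_sample * channels
    return size_bytes / bytes_per_second


# Sizes as reported by `du -sh` (binary units assumed).
large_seconds = pcm_duration_seconds(2.1 * 1024 * 1024)
small_seconds = pcm_duration_seconds(60 * 1024)

# Full self-attention builds a score matrix whose size grows with the
# square of the sequence length, so the duration ratio is squared here.
length_ratio = large_seconds / small_seconds
quadratic_ratio = length_ratio ** 2

print(f"large file: ~{large_seconds:.1f} s, small file: ~{small_seconds:.2f} s")
print(f"length ratio: ~{length_ratio:.1f}x, quadratic term: ~{quadratic_ratio:.0f}x")
```

Under these assumptions the failing file is a bit over two minutes of audio, while the working one is under four seconds. The failing allocation in the traceback is a `torch.matmul` inside the attention module, so, assuming full (non-local) self-attention, a quadratic growth in memory with audio length would be consistent with the error on a 3 GB card.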
Small files work without problems:
(venv) root@dk04:~/GigaAM# file /mnt/rec/0121d02a-d33e-43dd-ab2e-eb7067d0fac7.wav
/mnt/rec/0121d02a-d33e-43dd-ab2e-eb7067d0fac7.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, mono 8000 Hz
(venv) root@dk04:~/GigaAM# du -sh /mnt/rec/0121d02a-d33e-43dd-ab2e-eb7067d0fac7.wav
60K /mnt/rec/0121d02a-d33e-43dd-ab2e-eb7067d0fac7.wav
(venv) root@dk04:~/GigaAM# python3 ctc_inference.py --model_config ./data/ctc_model_config.yaml --model_weights ./data/ctc_model_weights.ckpt --device cuda --audio_path /mnt/rec/0121d02a-d33e-43dd-ab2e-eb7067d0fac7.wav
[NeMo W 2024-04-12 07:17:38 nemo_logging:393] /GigaAM/venv/lib/python3.11/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
torch.utils._pytree._register_pytree_node(
[NeMo W 2024-04-12 07:17:38 nemo_logging:393] Could not load dataset as `manifest_filepath` was None. Provided config : {'batch_size': 10, 'trim_silence': False, 'max_duration': 25.0, 'min_duration': 0.1, 'shuffle': True, 'is_tarred': False, 'num_workers': 8, 'pin_memory': True, 'manifest_filepath': None, 'labels': [' ', 'а', 'б', 'в', 'г', 'д', 'е', 'ж', 'з', 'и', 'й', 'к', 'л', 'м', 'н', 'о', 'п', 'р', 'с', 'т', 'у', 'ф', 'х', 'ц', 'ч', 'ш', 'щ', 'ъ', 'ы', 'ь', 'э', 'ю', 'я'], 'sample_rate': 16000}
[NeMo W 2024-04-12 07:17:38 nemo_logging:393] Could not load dataset as `manifest_filepath` was None. Provided config : {'batch_size': 20, 'shuffle': False, 'num_workers': 4, 'min_duration': 0.1, 'pin_memory': True, 'manifest_filepath': None, 'labels': [' ', 'а', 'б', 'в', 'г', 'д', 'е', 'ж', 'з', 'и', 'й', 'к', 'л', 'м', 'н', 'о', 'п', 'р', 'с', 'т', 'у', 'ф', 'х', 'ц', 'ч', 'ш', 'щ', 'ъ', 'ы', 'ь', 'э', 'ю', 'я'], 'sample_rate': 16000}
[NeMo W 2024-04-12 07:17:38 nemo_logging:393] Could not load dataset as `manifest_filepath` was None. Provided config : {'manifest_filepath': None, 'batch_size': 100, 'shuffle': False, 'num_workers': 4, 'pin_memory': True, 'labels': [' ', 'а', 'б', 'в', 'г', 'д', 'е', 'ж', 'з', 'и', 'й', 'к', 'л', 'м', 'н', 'о', 'п', 'р', 'с', 'т', 'у', 'ф', 'х', 'ц', 'ч', 'ш', 'щ', 'ъ', 'ы', 'ь', 'э', 'ю', 'я'], 'sample_rate': 16000}
[NeMo I 2024-04-12 07:17:38 nemo_logging:381] PADDING: 0
Transcribing: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 1.90it/s]
transcription: нету нету алло
There were no problems on the CPU either. I am trying this on the following GPU:
# nvidia-smi
Fri Apr 12 07:10:44 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15 Driver Version: 550.54.15 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce GTX 1060 3GB On | 00000000:01:00.0 On | N/A |
| 0% 42C P8 7W / 120W | 1MiB / 3072MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
Could you please advise whether some parameters should be adjusted? Here is the graph at the moment of the problem:
root@dk04:~# nvtop
Device 0 [NVIDIA GeForce GTX 1060 3GB] PCIe GEN 1@16x RX: 1.000 KiB/s TX: 0.000 KiB/s
GPU 139MHz MEM 405MHz TEMP 43°C FAN 0% POW 7 / 120 W
GPU[ 0%] MEM[| 0.068Gi/3.000Gi]
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
100│GPU0 % │
│GPU0 mem% │
│ │
│ │
│ │
75│ │
│ ┌─┐ │
│ │ │ │
│ │ │ │
│ │ │ │
50│ │ │ │
│ │ │ │
│ ┌─┘ │ │
│ │ │ │
│ │ │ │
25│ ┌┼──┐│ │
│ ││ ││ │
│ ││ ││ │
│ ││ ││ │
│ ││ ││ │
0│──────────────────────────────────────────────────────────────────────────────────────────┴┘ └┴──────────────────────────│
└61s──────────────────────────45s────────────────────────────30s───────────────────────────15s───────────────────────────0s┘
PID USER DEV TYPE GPU GPU MEM CPU HOST MEM Command
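Regarding the question about parameters above, one possible workaround (not confirmed by the project) would be to cut the long recording into shorter pieces before inference, so that each forward pass sees a shorter sequence. The sketch below uses only the Python standard library; the function name `split_wav`, the output file naming, and the 20-second default are assumptions chosen for illustration.

```python
import os
import wave


def split_wav(src_path, out_dir, chunk_seconds=20.0):
    """Split a PCM WAV file into consecutive chunks of at most chunk_seconds."""
    os.makedirs(out_dir, exist_ok=True)
    chunk_paths = []
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(chunk_seconds * src.getframerate())
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # end of the source file
            chunk_path = os.path.join(out_dir, f"chunk_{index:04d}.wav")
            with wave.open(chunk_path, "wb") as dst:
                # Copy channels, sample width and rate; the frame count in
                # the header is corrected automatically when the file closes.
                dst.setparams(params)
                dst.writeframes(frames)
            chunk_paths.append(chunk_path)
            index += 1
    return chunk_paths
```

Each resulting file could then be passed to `ctc_inference.py` through `--audio_path` one at a time and the transcriptions joined in order. Cutting at fixed offsets can split a word in half, so splitting on silence would likely give cleaner text; that is outside this sketch.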