CTranslate2

CUDA DeviceAllocate segfault

Open drzraf opened this issue 1 month ago • 3 comments

#0  0x00007bc0622c6554 in std::_Rb_tree_increment(std::_Rb_tree_node_base const*) () from /lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#1  0x00007bc05573e59a in cub::CachingDeviceAllocator::DeviceAllocate(int, void**, unsigned long, CUstream_st*) () from /home/.local/lib/python3.10/site-packages/ctranslate2.libs/libctranslate2.so.4
No symbol table info available.
#2  0x00007bc05573ea99 in ctranslate2::cuda::CubCachingAllocator::allocate(unsigned long, int) () from /home/.local/lib/python3.10/site-packages/ctranslate2.libs/libctranslate2.so.4
No symbol table info available.
#3  0x00007bc055712796 in ctranslate2::StorageView::reserve(long) () from /home/.local/lib/python3.10/site-packages/ctranslate2.libs/libctranslate2.so.4
No symbol table info available.
#4  0x00007bc0557127f8 in ctranslate2::StorageView::resize(std::vector<long, std::allocator<long> >) () from /home/.local/lib/python3.10/site-packages/ctranslate2.libs/libctranslate2.so.4
No symbol table info available.
#5  0x00007bc0556f59f2 in void ctranslate2::ops::MatMul::compute<(ctranslate2::Device)1, float>(ctranslate2::StorageView const&, ctranslate2::StorageView const&, ctranslate2::StorageView&) const ()
   from /home/.local/lib/python3.10/site-packages/ctranslate2.libs/libctranslate2.so.4
No symbol table info available.
#6  0x00007bc055660d24 in ctranslate2::layers::dot_product_attention(ctranslate2::StorageView const&, ctranslate2::StorageView const&, ctranslate2::StorageView const&, ctranslate2::StorageView const*, ctranslate2::StorageView const*, ctranslate2::StorageView const*, ctranslate2::StorageView const*, long, ctranslate2::StorageView&, ctranslate2::StorageView*, bool, float, bool, bool, long, ctranslate2::layers::Alibi*, ctranslate2::StorageView*) () from /home/.local/lib/python3.10/site-packages/ctranslate2.libs/libctranslate2.so.4
No symbol table info available.
#7  0x00007bc05566208d in ctranslate2::layers::MultiHeadAttention::operator()(ctranslate2::StorageView const&, ctranslate2::StorageView const&, ctranslate2::StorageView const*, ctranslate2::StorageView&, ctranslate2::StorageView*, ctranslate2::StorageView*, ctranslate2::StorageView*, ctranslate2::Padder const*, ctranslate2::Padder const*, bool, ctranslate2::StorageView*, long) const ()
   from /home/.local/lib/python3.10/site-packages/ctranslate2.libs/libctranslate2.so.4

CT2_VERBOSE=3 LD_LIBRARY_PATH=/home/.local/lib/python3.10/site-packages/ctranslate2.libs whisper-ctranslate2 --language=en --verbose=true --model small -f srt --output_dir /tmp/ foo.mp4

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.67                 Driver Version: 550.67         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce 940MX           Off |   00000000:01:00.0 Off |                  N/A |
| N/A   50C    P8             N/A /  200W |    1988MiB /   2048MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
  • The small model (sadly) doesn't fit in my 2GB GPU, but it causes a segfault instead of failing cleanly.
  • This happens with both the wheel and a hand-compiled .so.
  • The tiny model works (no OOM).
  • Important and unexpectedly useful workaround: with CT2_CUDA_ALLOW_BF16=1 CT2_CUDA_ALLOW_FP16=1 set, I could get small to run successfully on this GPU (!)
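
For anyone who wants to script the workaround, here is a minimal sketch that builds the same invocation as above with the two environment variables added. The helper names `workaround_env` and `whisper_command` are made up for illustration; only the variable names and the command-line arguments come from this report. It prints the command rather than running it, so it is safe to try without a GPU.

```python
import os
import shlex

# The two variables from the workaround above; values are strings because
# environment variables are always strings.
WORKAROUND_FLAGS = {
    "CT2_CUDA_ALLOW_FP16": "1",
    "CT2_CUDA_ALLOW_BF16": "1",
}


def workaround_env(base=None):
    """Return a copy of the environment with the workaround flags set.

    `base` defaults to the current process environment; the input mapping
    is never modified.
    """
    env = dict(os.environ if base is None else base)
    env.update(WORKAROUND_FLAGS)
    return env


def whisper_command(model="small", media="foo.mp4", output_dir="/tmp/"):
    """Rebuild the whisper-ctranslate2 command line used in this report."""
    return [
        "whisper-ctranslate2",
        "--language=en",
        "--verbose=true",
        "--model", model,
        "-f", "srt",
        "--output_dir", output_dir,
        media,
    ]


if __name__ == "__main__":
    prefix = " ".join(f"{k}={v}" for k, v in WORKAROUND_FLAGS.items())
    print(prefix, shlex.join(whisper_command()))
    # To actually run it (requires whisper-ctranslate2 on PATH):
    #   import subprocess
    #   subprocess.run(whisper_command(), env=workaround_env(), check=True)
```

Passing the variables through `env=` keeps them scoped to the child process, so they are in place before the library is loaded there.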

drzraf • May 25 '24 03:05