How to control GPU capacity allocation
OutOfMemoryError: CUDA out of memory. Tried to allocate 1.36 GiB. GPU 0 has a total capacity of 5.77 GiB of which 284.31 MiB is free. Including non-PyTorch memory, this process has 5.48 GiB memory in use. Of the allocated memory 5.34 GiB is allocated by PyTorch, and 22.94 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
I just want it to run speech-to-text translation, and I don't mind if it takes minutes or hours for a larger input.
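(For reference, the max_split_size_mb hint at the end of that message is set through PyTorch's PYTORCH_CUDA_ALLOC_CONF environment variable, which the allocator reads before its first CUDA allocation. It can mitigate fragmentation, but it won't free memory the model genuinely needs; the value below is an arbitrary illustration, not a tuned recommendation.)

```python
import os

# Must be set before the first CUDA allocation; 128 MiB is an
# arbitrary example value, not a tuned recommendation.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # imported afterwards so the allocator sees the setting
```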
Seamless doesn't support large inputs: you need to split the audio into chunks of under 30 seconds before feeding them to the model, as in the sketch below.
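A minimal sketch of that chunking loop, assuming the Hugging Face transformers wrapper for SeamlessM4T; the facebook/hf-seamless-m4t-medium checkpoint, the input.wav path, and the "eng" target language are placeholders, and class names and generate() arguments can differ between transformers versions:

```python
import torch
import torchaudio
from transformers import AutoProcessor, SeamlessM4TModel

MODEL_NAME = "facebook/hf-seamless-m4t-medium"  # placeholder checkpoint
CHUNK_SECONDS = 30       # keep each piece under the ~30 s limit
TARGET_SR = 16_000       # the model expects 16 kHz mono audio

processor = AutoProcessor.from_pretrained(MODEL_NAME)
model = SeamlessM4TModel.from_pretrained(MODEL_NAME).to("cuda")

waveform, sr = torchaudio.load("input.wav")  # (channels, samples)
waveform = waveform.mean(dim=0)              # downmix to mono
if sr != TARGET_SR:
    waveform = torchaudio.functional.resample(waveform, sr, TARGET_SR)

# Translate one chunk at a time so only ~30 s of audio is ever on the GPU.
chunk_len = CHUNK_SECONDS * TARGET_SR
pieces = []
for start in range(0, waveform.numel(), chunk_len):
    chunk = waveform[start:start + chunk_len]
    inputs = processor(audios=chunk.numpy(), sampling_rate=TARGET_SR,
                       return_tensors="pt").to("cuda")
    with torch.no_grad():
        tokens = model.generate(**inputs, tgt_lang="eng",
                                generate_speech=False)
    pieces.append(processor.decode(tokens[0].tolist()[0],
                                   skip_special_tokens=True))

print(" ".join(pieces))
```

Because each chunk runs under torch.no_grad() and only one chunk is resident at a time, peak GPU memory stays roughly constant no matter how long the input is. Cutting on a fixed 30-second grid can split a sentence mid-word and degrade the translation near the boundaries, though; if quality matters, split on silences instead (e.g. with a voice activity detector) while keeping each piece under the limit.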