
CUDA integration for Python, plus shiny features

Results: 94 pycuda issues, sorted by recently updated

**Is your feature request related to a problem? Please describe.** We recently identified a (GPU) memory leak in a routine that creates a new cuda stream on a given context...

enhancement
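For context, here is a minimal sketch of the kind of routine the report describes, assuming an explicitly managed context with a stream created inside it; the loop count and the cleanup calls are illustrative, not taken from the report:

```python
import pycuda.driver as cuda

cuda.init()
ctx = cuda.Device(0).make_context()   # pushes a new context onto this thread
try:
    for _ in range(100):
        stream = cuda.Stream()        # created in the currently active context
        # ... enqueue asynchronous work on `stream` ...
        stream.synchronize()
        del stream                    # drop the reference so the handle can be released
finally:
    ctx.pop()                         # unbind before the context object goes away
    ctx.detach()                      # release our reference to the context
```

Dropping the Python reference to the stream and popping/detaching the context is the usual cleanup path; the issue is about streams created this way still holding GPU memory.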

Here's an MWE:

```python
>>> import numpy as np
>>> import pycuda.autoinit
>>> import pycuda.gpuarray as gpuarray
>>> a = np.array(True)
>>> b = np.array(False)
>>> a_gpu = gpuarray.to_gpu(a)
>>> ...
```

bug

Hello, in order to test Python 3.11 with pycuda, I have just installed the latest pycuda version: 2022.1. I use CUDA 11.6.2 on a Windows 11 laptop. When I try...

bug

Here's an MWE:

```python
>>> import pycuda.autoinit
>>> import pycuda.gpuarray as cu_np
>>> a = cu_np.zeros(10, dtype="int32") + 1
>>> b = cu_np.zeros(10, dtype="int32") + 2
>>> a / b
...
```

bug

Hi there! I wanted to experiment with CUDA Graphs a bit to get a feel for the performance differences between blocking, async and graph execution. See: * https://developer.nvidia.com/blog/cuda-graphs/ * https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#cuda-graphs...
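For reference, a minimal async-stream baseline of the sort such a comparison would start from; the kernel, sizes, and names below are illustrative, and the graph-capture path itself is omitted because its availability depends on the installed PyCUDA/CUDA versions:

```python
import numpy as np
import pycuda.autoinit
import pycuda.driver as drv
from pycuda.compiler import SourceModule

# Illustrative kernel: scale a float array in place.
mod = SourceModule("""
__global__ void scale(float *x, float s, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= s;
}
""")
scale = mod.get_function("scale")

n = 1 << 20
host = drv.pagelocked_empty(n, dtype=np.float32)  # pinned memory for async copies
host[:] = 1.0
dev = drv.mem_alloc(host.nbytes)

stream = drv.Stream()
drv.memcpy_htod_async(dev, host, stream)          # H2D copy on the stream
scale(dev, np.float32(2.0), np.int32(n),
      block=(256, 1, 1), grid=((n + 255) // 256, 1),
      stream=stream)                              # kernel launch on the same stream
drv.memcpy_dtoh_async(host, dev, stream)          # D2H copy on the stream
stream.synchronize()                              # single sync point at the end
```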

Basically, I want to achieve concurrent work with multithreading; my current inference code is pycuda + TensorRT. **Why I want to do so:** I'm trying to optimize the inference...
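A common pattern for this kind of setup is to create one context and have each worker thread push it before doing GPU work and pop it afterwards. The sketch below shows that pattern only; `run_inference` is a placeholder, and the actual TensorRT integration from the issue is not shown:

```python
import threading
import pycuda.driver as cuda

cuda.init()
ctx = cuda.Device(0).make_context()   # created (and pushed) on the main thread
ctx.pop()                             # leave it unbound so workers can push it

def worker(run_inference, inputs):
    ctx.push()                        # bind the shared context to this thread
    try:
        run_inference(inputs)         # placeholder for the pycuda/TensorRT work
    finally:
        ctx.pop()                     # always unbind, even on error

threads = [threading.Thread(target=worker, args=(print, i)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

ctx.detach()                          # release the context when all work is done
```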

Solves https://github.com/inducer/pycuda/issues/319. I'm not exactly sure my work is proper, since I'm more of a Python person than a C++ one. Can you share your thoughts, @inducer, please?

People like me always forget to pop the context, which causes errors. By implementing these two methods, Python will handle the context automatically. See https://book.pythontips.com/en/latest/context_managers.html. This also helps when there...
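To illustrate the idea (a sketch of the concept, not the actual patch): a small wrapper whose `__enter__`/`__exit__` methods push and pop the context, so forgetting to pop becomes impossible inside a `with` block.

```python
import pycuda.driver as cuda

class ManagedContext:
    """Illustrative wrapper: enters by making a context, exits by popping it."""

    def __init__(self, device_ordinal=0):
        self._device_ordinal = device_ordinal
        self._ctx = None

    def __enter__(self):
        cuda.init()
        self._ctx = cuda.Device(self._device_ordinal).make_context()
        return self._ctx

    def __exit__(self, exc_type, exc_value, traceback):
        self._ctx.pop()      # the pop happens even if the body raised
        self._ctx.detach()
        return False         # do not swallow exceptions

# Usage:
# with ManagedContext(0) as ctx:
#     ...  # allocate memory, launch kernels
```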

I have created a Streamlit app as a demo of a project on Multilingual Text Classification using mBERT in PyTorch. When I run the app with the command `python...

**Describe the bug** I want to initialize as many cuda contexts as possible in a multi-threaded environment, but when cuda.Device(0).make_context() throws an exception, the GPU memory allocated by make_context cannot...

bug
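The usual defensive pattern around this looks roughly like the sketch below: any context that was successfully created is always popped and detached, even on error. This does not address the case the report is about, where make_context() itself fails partway; the device count, thread count, and error handling here are illustrative.

```python
import threading
import pycuda.driver as cuda

cuda.init()

def make_context_worker(results, index):
    ctx = None
    try:
        ctx = cuda.Device(0).make_context()   # may raise, e.g. on out-of-memory
        results[index] = "ok"
    except cuda.Error as err:
        results[index] = repr(err)
    finally:
        if ctx is not None:
            ctx.pop()
            ctx.detach()                      # release whatever was created

results = {}
threads = [threading.Thread(target=make_context_worker, args=(results, i))
           for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```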