heli

Results 7 issues of heli

Hello, the `memcpy_htod_async` function currently supports only these parameters: `pycuda.driver.memcpy_htod_async(dest, src, stream=None)`. Can you extend this API with an additional parameter, `size`? Here `size` means how many bytes will be copied. Or is...
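Until such a parameter exists, one way to approximate the request is to pass a truncated view of the source buffer. The sketch below is only an illustration: `first_bytes` is a hypothetical helper, not part of PyCUDA, and it assumes `src` is a C-contiguous object exposing the Python buffer protocol. The PyCUDA call appears only in a comment because it needs a GPU and an active context.

```python
def first_bytes(src, size):
    """Return a zero-copy view of the first `size` bytes of `src`.

    Hypothetical helper: `src` is assumed to be a C-contiguous object
    that supports the buffer protocol (bytearray, array.array, ...).
    """
    view = memoryview(src).cast("B")  # reinterpret the buffer as raw bytes
    if not 0 <= size <= view.nbytes:
        raise ValueError(f"size must be in [0, {view.nbytes}], got {size}")
    return view[:size]  # slicing a memoryview does not copy


# Assumed usage with the signature quoted in the issue:
#   pycuda.driver.memcpy_htod_async(dest, first_bytes(src, size), stream)
```

Slicing on the host side keeps the existing three-argument call unchanged, at the cost of the caller having to build the view.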

### What operating system are you using? Windows ### What browser are you using? Edge ### Describe the bug The chat page is too narrow; please add an option to adjust its width ### What prompt did you enter? _No response_ ### Console...

**Is your feature request related to a problem? Please describe.** * Normally, we would like to set log verbose=1 for printing the request logs to stdout, like the following image:...

This commit introduces a debug mode for the Triton Python backend. When the environment variable `TRITON_DEBUG` is set to "1", the backend will import the `debugpy` module and start listening...
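The gating described above can be sketched roughly as follows. This is not the code from the commit: `maybe_start_debugger` is a hypothetical name, and the listen address is an assumption, since the excerpt is cut off before it says where the backend listens. `debugpy.listen` is the real debugpy entry point and is imported only when the flag is set.

```python
import os


def maybe_start_debugger(environ=os.environ, listen=None,
                         address=("127.0.0.1", 5678)):
    """Start a debug listener only when TRITON_DEBUG is exactly "1".

    `address` is an assumed placeholder; the original excerpt does not
    state the host or port. `listen` can be injected for testing.
    """
    if environ.get("TRITON_DEBUG") != "1":
        return False  # debug mode off: debugpy is never imported
    if listen is None:
        import debugpy  # lazy import, so it is optional in normal runs
        listen = debugpy.listen
    listen(address)
    return True
```

Importing `debugpy` lazily means a deployment that never sets `TRITON_DEBUG` does not need the package installed.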

### Is this the right place to submit this? - [X] This is not a security vulnerability or a crashing bug - [X] This is not a question about how...

feature/Multi-cluster

**Description** I am observing a difference in Triton Server's behavior when it is started with `mpirun` compared to when it is started directly. Specifically, when I use `mpirun --allow-run-as-root -n 1 /opt/tritonserver/bin/tritonserver`,...

* Currently, with time-slicing or MPS GPU sharing, multiple processes occupy GPU memory at the same time, which prevents any single process from using all of the memory. Is there any technology or configuration that...