valafon

Results 9 issues of valafon

I’ve been using the MPS (Multi-Process Service) daemon to limit per-process resource usage through the CUDA_MPS_ACTIVE_THREAD_PERCENTAGE and CUDA_MPS_PINNED_DEVICE_MEM_LIMIT environment variables, and it’s been working well. However, I’ve encountered...

Hello! I launched the gpu-manager DaemonSet on a node. Then I started a pod on this node that requested tencent.com/vcuda-memory:2. As I understand from the README, 1 vcuda...

If we delete a team and recreate it by script, the team members are not restored. As far as I can see, membership is restored by team ID. But when the team...

Here is an example: a server has 8 GPUs, each split into 5 vGPUs, so a free node has 40 vGPUs in total. Then...

I’ve been using the MPS (Multi-Process Service) daemon to limit per-process resource usage through the CUDA_MPS_ACTIVE_THREAD_PERCENTAGE and CUDA_MPS_PINNED_DEVICE_MEM_LIMIT environment variables, and it’s been working well. However, I’ve encountered...

lifecycle/stale

### Proposal Add a "Restart All Allocations" button to the job's main window in the web UI. As an example, I'm attaching a screenshot of how I see...

type/enhancement
theme/ui
stage/needs-discussion

When I create pods using the plugin, it fills the first GPU first and starts filling the second one only once the first is full. At...

The plugin allows using GPU memory for pod scheduling. Is it possible to add CPU as well to the scheduling so that both parameters participate in the scheduling? The CPU...

We have the following problem: we need to deploy a job in an environment where nodes can be either amd64 or arm64. This isn't an issue when the jobspec...

type/enhancement