rongfu.leng issues

Results 92 issues of rongfu.leng

Pod scheduled to a node without a GPU

1. There are two nodes: one has a GPU, the other does not. 2. Apply a pod YAML that requests `nvidia.com/gpu` resources; sometimes the pod is scheduled to the node without a GPU.
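A minimal sketch of the kind of pod the reproduction steps describe, built as a Python dict and printed as JSON (which `kubectl apply -f` also accepts). The pod name, image, and the helper `gpu_pod_manifest` are hypothetical and do not come from the issue; only the `nvidia.com/gpu` resource name does.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int = 1) -> dict:
    """Build a Pod manifest that requests `gpus` NVIDIA GPUs.

    Extended resources such as nvidia.com/gpu are requested under
    resources.limits on the container.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical name and image, for illustration only.
    manifest = gpu_pod_manifest("gpu-test", "example.invalid/cuda-app:latest")
    print(json.dumps(manifest, indent=2))
```

With a manifest like this, the pod would be expected to land only on the node that advertises `nvidia.com/gpu` capacity; the issue reports that this is not always the case.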

issue/stale

Add GPU topology awareness when using more than one GPU device

kind/feature

Release version: hope we can upgrade the Helm chart version

issue/stale

Add CUDA MPS for shared access GPUs

https://github.com/NVIDIA/k8s-device-plugin?tab=readme-ov-file#with-cuda-mps

Support device path mapping and permission in ctr

Issue: https://github.com/containerd/containerd/issues/5046 Usage: UNIX: ``` ctr run --device= ctr run --device=: ctr run --device=:: ``` Windows: (the CLI supports setting HOST PATH but not CONTAINER PATH) ``` ctr run...

needs-ok-to-test

size/S

Add a match condition in the webhook to filter pod resources by scheduler name

Add `matchConditions` to the webhook when the rule is a pod resource create operation, to validate whether `object.spec.schedulerName` is the default Volcano scheduler name `volcano`; currently we don't consider scheduler name changes. /kind feature...
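The idea can be sketched as follows, assuming the Kubernetes `matchConditions` field of an admission webhook, where each entry has a `name` and a CEL `expression`. The condition name `match-scheduler-name` and both helper functions are hypothetical; the actual PR may structure this differently. The `pod_matches` function only mirrors the CEL check in Python for local reasoning and is not how the API server evaluates it.

```python
import json

# Default Volcano scheduler name, as stated in the PR description.
DEFAULT_SCHEDULER_NAME = "volcano"


def scheduler_match_condition(scheduler_name: str = DEFAULT_SCHEDULER_NAME) -> dict:
    """Build one matchConditions entry for a webhook configuration.

    The CEL expression restricts the webhook to pods whose
    spec.schedulerName equals the given scheduler name.
    """
    return {
        "name": "match-scheduler-name",  # hypothetical condition name
        "expression": f"object.spec.schedulerName == '{scheduler_name}'",
    }


def pod_matches(pod: dict, scheduler_name: str = DEFAULT_SCHEDULER_NAME) -> bool:
    """Python mirror of the CEL expression, for illustration only."""
    return pod.get("spec", {}).get("schedulerName") == scheduler_name


if __name__ == "__main__":
    print(json.dumps({"matchConditions": [scheduler_match_condition()]}, indent=2))
    print(pod_matches({"spec": {"schedulerName": "volcano"}}))
    print(pod_matches({"spec": {"schedulerName": "default-scheduler"}}))
```

Because the scheduler name is hard-coded into the expression, a deployment that renames the scheduler would need a different value, which is the limitation the description notes.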

kind/feature

ok-to-test

size/S

When the volcano-admission pod is not running, creating other pods can fail

### Description When the volcano-admission pod crashes, it prevents me from creating other pods. ### Steps to reproduce the issue 1. Install Volcano using `helm install` 2. Scale volcano-admission replicas to...

help wanted

good first issue

kind/bug
