HAMi
[bug] Low-priority tasks will not be blocked.
Two pods run on the same node: pod A is set to nvidia.com/priority: "0" and pod B is set to nvidia.com/priority: "1".
I then ran the following program in pod B and found that it still runs to completion; it is not blocked.
#include <stdio.h>
#include <unistd.h>

const int N = 16;
const int blocksize = 16;

__global__
void hello(char *a, int *b)
{
    a[threadIdx.x] += b[threadIdx.x];
}

int main()
{
    char a[N] = "Hello \0\0\0\0\0\0";
    int b[N] = {15, 10, 6, 0, -11, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
    char *ad;
    int *bd;
    const int csize = N*sizeof(char);
    const int isize = N*sizeof(int);

    printf("%s", a);

    cudaMalloc( (void**)&ad, csize );
    cudaMalloc( (void**)&bd, isize );
    cudaMemcpy( ad, a, csize, cudaMemcpyHostToDevice );
    cudaMemcpy( bd, b, isize, cudaMemcpyHostToDevice );

    dim3 dimBlock( blocksize, 1 );
    dim3 dimGrid( 1, 1 );
    hello<<<dimGrid, dimBlock>>>(ad, bd);

    cudaMemcpy( a, ad, csize, cudaMemcpyDeviceToHost );
    cudaFree( ad );
    cudaFree( bd );

    printf("%s\n", a);
    sleep(10);
    return 0;
}
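The program above issues a single 16-thread launch that finishes almost instantly, so a short stall would be hard to spot. A longer-running variant may make the behavior easier to observe. The following is only a minimal sketch and not part of the original reproduction: it uses the standard CUDA runtime API, the names `spin`, `kLaunches` and `kCycles` are hypothetical, and it assumes that blocking, if it happens, takes effect around kernel launches, which would show up as unusually long per-launch times.

```cuda
#include <cstdio>
#include <chrono>
#include <cuda_runtime.h>

// Hypothetical busy kernel: each thread spins for roughly `cycles` GPU clock ticks.
__global__ void spin(long long cycles)
{
    long long start = clock64();
    while (clock64() - start < cycles) { }
}

int main()
{
    const int kLaunches = 50;               // hypothetical number of launches
    const long long kCycles = 100000000LL;  // hypothetical spin length per launch

    for (int i = 0; i < kLaunches; ++i) {
        auto t0 = std::chrono::steady_clock::now();

        spin<<<1, 32>>>(kCycles);
        cudaError_t err = cudaGetLastError();   // catches launch-configuration errors
        if (err == cudaSuccess) {
            err = cudaDeviceSynchronize();      // waits for the kernel to finish
        }

        auto t1 = std::chrono::steady_clock::now();
        if (err != cudaSuccess) {
            fprintf(stderr, "launch %d failed: %s\n", i, cudaGetErrorString(err));
            return 1;
        }

        // Wall-clock time for launch + completion; a blocked launch should stand out.
        double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
        printf("launch %d: %.1f ms\n", i, ms);
    }
    return 0;
}
```

Running this in pod B while pod A is busy on the same GPU, and comparing the per-launch times against a run with pod A idle, would show whether any launches are delayed.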
I also printed the values in the shared cache, which confirm that the pod should be blocked.
root@cuda-12-runtime-6d7cb75b56-7xs68:~# ./mmap_read --filename=/usr/local/vgpu/c11a0a04-ade9-461f-994e-e7f5a8e448b8.cache
cachestr=
initializedFlag 19920718
smInitFlag 0
ownerPid 0
sem {[0 0 0 0 1 0 0 0 0 0 0 0 128 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]}
num 1
uuids [uuid=GPU-26a583dd-542e-09bb-5dd1-9cc5bd6eb552 ]
limit [157286400 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
sm_limit [10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10]
procnum 0
utilizationSwitch 1
recentKernel -1
priority 1
procs [
pid=292, hostpid=624912, used=[ ], monitorused=[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0], status=1
pid=216, hostpid=0, used=[ ], monitorused=[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0], status=1
root@cuda-12-runtime-6d7cb75b56-7xs68:~# time ./hello
[4pdvGPU Msg(292:139722028023808:libvgpu.c:869)]: Initializing.....
[4pdvGPU Warn(292:139722028023808:utils.c:228)]: get default cuda 1 from (null)
[4pdvGPU Msg(292:139722028023808:libvgpu.c:902)]: Initialized
Hello Hello
[4pdvGPU Msg(292:139722028023808:multiprocess_memory_limit.c:477)]: Calling exit handler 292
real 0m11.668s
user 0m0.205s
sys 0m0.330s