
Fix some potential nil pointer panics

Open coldzerofear opened this issue 9 months ago • 7 comments

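Nil pointer panics can occur at certain stages of the device-sharing plugin lifecycle. We should avoid them as far as possible, reduce unnecessary nil checks, and improve stability. The log below shows one case: `AddResource` is called on a nil `*GPUDevices` receiver (first argument `0x0`), and the following `GetStatus` call dereferences it.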

E0522 09:37:32.729272       1 node_info.go:396] "Idle resources turn into negative after allocated" nodeName="yjy-server" task="jupyter/d3018-fbd69-0" resources=["volcano.sh/vgpu-number"] idle="cpu 94900.00, memory 538562419712.00, pods 107.00, volcano.sh/vgpu-number -1000.00, attachable-volumes-csi-cephfs.csi.ceph.com 2147483647.00, ephemeral-storage 16605673168980000.00, hugepages-1Gi 0.00, hugepages-2Mi 0.00" req="cpu 1000.00, memory 2000000000.00, volcano.sh/vgpu-number 1000.00, pods 1.00"
E0522 09:37:32.729498       1 runtime.go:79] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 352 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x1f58d40?, 0x373f900})
        /go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:75 +0x99
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc0004723a0?})
        /go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:49 +0x75
panic({0x1f58d40, 0x373f900})
        /usr/local/go/src/runtime/panic.go:884 +0x213
volcano.sh/volcano/pkg/scheduler/api/devices/nvidia/vgpu.(*GPUDevices).GetStatus(0xc0007173c0?)
        /go/src/volcano.sh/volcano/pkg/scheduler/api/devices/nvidia/vgpu/metrics.go:71 +0x18
volcano.sh/volcano/pkg/scheduler/api/devices/nvidia/vgpu.(*GPUDevices).AddResource(0x0, 0xc000b10d80)
        /go/src/volcano.sh/volcano/pkg/scheduler/api/devices/nvidia/vgpu/device_info.go:148 +0xca
volcano.sh/volcano/pkg/scheduler/api.(*NodeInfo).addResource(0xc000f26600, 0xc000d155c0?)
        /go/src/volcano.sh/volcano/pkg/scheduler/api/node_info.go:496 +0xb9
volcano.sh/volcano/pkg/scheduler/api.(*NodeInfo).setNode(0xc000f26600, 0xc0004802c0)
        /go/src/volcano.sh/volcano/pkg/scheduler/api/node_info.go:383 +0x98b
volcano.sh/volcano/pkg/scheduler/api.(*NodeInfo).SetNode(0xc000b8e180, 0xc0002a1500?)
        /go/src/volcano.sh/volcano/pkg/scheduler/api/node_info.go:333 +0x54
volcano.sh/volcano/pkg/scheduler/cache.(*SchedulerCache).AddOrUpdateNode(0xc000236280, 0xc0004802c0)
        /go/src/volcano.sh/volcano/pkg/scheduler/cache/event_handlers.go:495 +0x125
volcano.sh/volcano/pkg/scheduler/cache.(*SchedulerCache).SyncNode(0xc000236280, {0xc00005d8d0, 0xa})
        /go/src/volcano.sh/volcano/pkg/scheduler/cache/event_handlers.go:612 +0x45b
volcano.sh/volcano/pkg/scheduler/cache.(*SchedulerCache).processSyncNode(0xc000236280)
        /go/src/volcano.sh/volcano/pkg/scheduler/cache/cache.go:1167 +0x1b6
volcano.sh/volcano/pkg/scheduler/cache.(*SchedulerCache).runNodeWorker(...)
        /go/src/volcano.sh/volcano/pkg/scheduler/cache/cache.go:1149
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x30?)
        /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:226 +0x3e
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0x0?, {0x253f680, 0xc000ae0db0}, 0x1, 0xc0004e81e0)
        /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:227 +0xb6
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0x0?, 0x0, 0x0, 0x0?, 0x0?)
        /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:204 +0x89
k8s.io/apimachinery/pkg/util/wait.Until(0x0?, 0x0?, 0x0?)
        /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:161 +0x25
created by volcano.sh/volcano/pkg/scheduler/cache.(*SchedulerCache).Run
        /go/src/volcano.sh/volcano/pkg/scheduler/cache/cache.go:790 +0x92
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
        panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x17c8038]
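
The trace above shows `AddResource` running with a nil `*GPUDevices` receiver and `GetStatus` then dereferencing it. One possible way to harden this kind of code is a nil-receiver guard at the top of each method. The following is only a minimal sketch under stated assumptions: the `GPUDevices` struct, its `Name` and `Shared` fields, and both method signatures here are hypothetical stand-ins for illustration, not the actual Volcano definitions.

```go
package main

import "fmt"

// GPUDevices is a hypothetical stand-in for the scheduler's vGPU device
// state. The fields are illustrative and not taken from Volcano.
type GPUDevices struct {
	Name   string
	Shared map[string]int // pod UID -> number of shared devices (assumed)
}

// GetStatus tolerates a nil receiver instead of dereferencing it.
func (gs *GPUDevices) GetStatus() string {
	if gs == nil {
		return "uninitialized"
	}
	return fmt.Sprintf("%s: %d pods", gs.Name, len(gs.Shared))
}

// AddResource reports false on a nil receiver, so a node without
// initialized vGPU state is skipped rather than causing a panic.
func (gs *GPUDevices) AddResource(podUID string) bool {
	if gs == nil {
		return false
	}
	if gs.Shared == nil {
		gs.Shared = make(map[string]int)
	}
	gs.Shared[podUID]++
	return true
}

func main() {
	// A nil pointer, as in the trace where the receiver is 0x0.
	var missing *GPUDevices
	fmt.Println(missing.AddResource("pod-a"), missing.GetStatus())

	// An initialized value behaves normally.
	ready := &GPUDevices{Name: "node-1"}
	fmt.Println(ready.AddResource("pod-a"), ready.GetStatus())
}
```

Go allows calling a method on a nil pointer receiver; the panic only happens when the method body dereferences the pointer. Guarding inside the method keeps the check in one place, which fits the goal of avoiding scattered nil checks at every call site.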

coldzerofear, May 23 '24 05:05