[Controller] Panic when creating a KVCache
🐛 Describe the bug
Creating a KVCache from the example kvcache YAML causes the controller-manager to panic.
Steps to Reproduce
Apply the example kvcache file from: https://github.com/vllm-project/aibrix/blob/main/config/samples/orchestration_v1alpha1_kvcache.yaml
The controller-manager then panics. Log details:
I0520 11:21:15.502966 1 controller.go:115] "msg"="Observed a panic in reconciler: cannot parse '': quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$'" "KVCache"={"name":"aibrix-deepseek-coder-33b-kvcache","namespace":"aibrix-demo"} "controller"="kv-cache-controller" "controllerGroup"="orchestration.aibrix.ai" "controllerKind"="KVCache" "name"="aibrix-deepseek-coder-33b-kvcache" "namespace"="aibrix-demo" "reconcileID"="8881e4c3-1049-4b1f-b034-15a4604dc024"
panic: cannot parse '': quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$' [recovered]
panic: cannot parse '': quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$'
goroutine 499 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:116 +0x1e5
panic({0x1bbc0c0?, 0xc0006aa7e0?})
/usr/local/go/src/runtime/panic.go:770 +0x132
k8s.io/apimachinery/pkg/api/resource.MustParse({0x0, 0x0})
/go/pkg/mod/k8s.io/[email protected]/pkg/api/resource/quantity.go:139 +0x173
github.com/aibrix/aibrix/pkg/controller/kvcache.(*KVCacheReconciler).reconcileDeployment(0xc0009dc090, {0x21a30b0, 0xc00159af90}, 0xc0011aa008)
/workspace/pkg/controller/kvcache/kvcache_controller.go:510 +0xdbc
github.com/aibrix/aibrix/pkg/controller/kvcache.(*KVCacheReconciler).Reconcile(0xc0009dc090, {0x21a30b0, 0xc00159af90}, {{{0xc00088a830?, 0x0?}, {0xc0001e86c0?, 0xc000b99d10?}}})
/workspace/pkg/controller/kvcache/kvcache_controller.go:167 +0x15c
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x21a74f0?, {0x21a30b0?, 0xc00159af90?}, {{{0xc00088a830?, 0xb?}, {0xc0001e86c0?, 0x0?}}})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:119 +0xb7
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0007ff9a0, {0x21a30e8, 0xc00053c690}, {0x1cc4200, 0xc00101e380})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:316 +0x3bc
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0007ff9a0, {0x21a30e8, 0xc00053c690})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266 +0x1be
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227 +0x79
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 390
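The trace above shows `resource.MustParse` being called with an empty string (`{0x0, 0x0}`), which panics instead of returning an error. A minimal sketch of a pre-check that turns this into an ordinary error, using only the regular expression quoted in the panic message. The helper name `checkQuantity` and the field name `memory` are hypothetical, not taken from the controller code, and it is not known from the log which field was actually empty. In the controller itself, `resource.ParseQuantity` (which returns an error) would be the natural replacement for `MustParse`.

```go
package main

import (
	"fmt"
	"regexp"
)

// quantityPattern is copied from the panic message in the log.
// It is only a coarse pre-check; the real parser validates more than this.
var quantityPattern = regexp.MustCompile(`^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$`)

// checkQuantity is a hypothetical helper: it reports an error for a value
// the pattern rejects, so the caller can fail the reconcile gracefully
// instead of panicking.
func checkQuantity(field, value string) error {
	if !quantityPattern.MatchString(value) {
		return fmt.Errorf("%s: cannot parse %q as a quantity", field, value)
	}
	return nil
}

func main() {
	// "memory" is an assumed field name used only for illustration.
	for _, v := range []string{"", "4Gi", "500m"} {
		if err := checkQuantity("memory", v); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", v)
	}
}
```

With a check like this, an unset resource field would surface as a reconcile error rather than crashing the controller-manager.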
Expected behavior
The KVCache object is created successfully.
Environment
- AIBrix Version: v0.2.1
- Deployment env: Kubernetes
@yyzxw could you try the latest v0.3.0-rc.1 release instead of v0.2.1? They are not compatible: the sample linked above is the latest, but v0.2.1 still uses legacy code.
A panic occurs when creating a KVCache with AIBrix v0.4.0.
KVCache YAML: https://github.com/vllm-project/aibrix/blob/main/samples/kvcache/vineyard/kvcache.yaml
Logs:
E0829 03:16:42.770905 1 signal_unix.go:881] "msg"="Observed a panic" "error"=null "KVCache"={"name":"deepseek-coder-7b-kvcache","namespace":"default"} "controller"="kv-cache-controller" "controllerGroup"="orchestration.aibrix.ai" "controllerKind"="KVCache" "name"="deepseek-coder-7b-kvcache" "namespace"="default" "panic"="runtime error: invalid memory address or nil pointer dereference" "panicGoValue"="\"invalid memory address or nil pointer dereference\"" "reconcileID"="eda3563d-7adb-4eb4-b49c-e4961819da3b" "stacktrace"="goroutine 7228 [running]:\nk8s.io/apimachinery/pkg/util/runtime.logPanic({0x2906e00, 0xc004129680}, {0x1f3b2a0, 0x3b85580})\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:107 +0xbc\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile.func1()\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:105 +0x112\npanic({0x1f3b2a0?, 0x3b85580?})\n\t/usr/local/go/src/runtime/panic.go:770 +0x132\ngithub.com/vllm-project/aibrix/pkg/controller/kvcache/backends.(*VineyardReconciler).reconcileMetadataService(0xc000580808?, {0x2906e00?, 0xc004129680?}, 0x1?)\n\t/workspace/pkg/controller/kvcache/backends/vineyard.go:70 +0x61\ngithub.com/vllm-project/aibrix/pkg/controller/kvcache/backends.VineyardReconciler.Reconcile({0xc003d88100}, {0x2906e00, 0xc004129680}, 0xc00204ab40)\n\t/workspace/pkg/controller/kvcache/backends/vineyard.go:45 +0x30\ngithub.com/vllm-project/aibrix/pkg/controller/kvcache.(*KVCacheReconciler).Reconcile(0xc003c7a5f0, {0x2906e00, 0xc004129680}, {{{0xc00423cb80?, 0x1a1e35a?}, {0xc005448200?, 0x0?}}})\n\t/workspace/pkg/controller/kvcache/kvcache_controller.go:163 +0x2bd\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile(0xc001c8d440?, {0x2906e00?, 0xc004129680?}, {{{0xc00423cb80?, 0x0?}, {0xc005448200?, 0x0?}}})\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:116 
+0xd4\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler(0x2922240, {0x2906e38, 0xc0004e1540}, {{{0xc00423cb80, 0x7}, {0xc005448200, 0x19}}})\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:303 +0x3bc\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem(0x2922240, {0x2906e38, 0xc0004e1540})\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263 +0x21d\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2()\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:224 +0x8a\ncreated by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2 in goroutine 7078\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:220 +0x490\n"
E0829 03:16:42.771003 1 controller.go:316] "msg"="Reconciler error" "error"="panic: runtime error: invalid memory address or nil pointer dereference [recovered]" "KVCache"={"name":"deepseek-coder-7b-kvcache","namespace":"default"} "controller"="kv-cache-controller" "controllerGroup"="orchestration.aibrix.ai" "controllerKind"="KVCache" "name"="deepseek-coder-7b-kvcache" "namespace"="default" "reconcileID"="eda3563d-7adb-4eb4-b49c-e4961819da3b"
Possible cause: the watcher is not configured, but the code still tries to access Watcher.env:
https://github.com/vllm-project/aibrix/blob/main/pkg/controller/kvcache/backends/hpkv.go#L135
https://github.com/vllm-project/aibrix/blob/main/pkg/controller/kvcache/backends/infinistore.go#L130
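To illustrate the suspected cause, here is a minimal sketch of a nil guard around an optional watcher section. The types `KVCacheSpec`, `WatcherSpec`, and `EnvVar` and the helper `watcherEnv` are simplified, hypothetical stand-ins, not the actual AIBrix API types. The sketch only assumes that the watcher is held through a pointer that stays nil when the YAML omits it.

```go
package main

import "fmt"

// EnvVar, WatcherSpec, and KVCacheSpec are hypothetical, simplified
// stand-ins for the real API types.
type EnvVar struct{ Name, Value string }

type WatcherSpec struct {
	Env []EnvVar
}

type KVCacheSpec struct {
	// Watcher is assumed to be nil when the manifest has no watcher section.
	Watcher *WatcherSpec
}

// watcherEnv returns the watcher's env vars, or nil when no watcher is
// configured. Reading spec.Watcher.Env without this check would panic
// with a nil pointer dereference.
func watcherEnv(spec KVCacheSpec) []EnvVar {
	if spec.Watcher == nil {
		return nil
	}
	return spec.Watcher.Env
}

func main() {
	// No watcher configured: the guard returns an empty result.
	fmt.Println(len(watcherEnv(KVCacheSpec{})))

	// Watcher configured with one (made-up) env var.
	configured := KVCacheSpec{
		Watcher: &WatcherSpec{Env: []EnvVar{{Name: "LOG_LEVEL", Value: "debug"}}},
	}
	fmt.Println(len(watcherEnv(configured)))
}
```

An alternative would be to default the watcher section when it is missing, so later code can rely on it being non-nil. Which approach fits better depends on how the backends use the field.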
I can also try to fix this bug.
@Jeffwan PTAL
@zhixian82 thanks for this. You're welcome to submit a PR 😄