autoscaling
0/3 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 2 Insufficient neonvm/kvm.
I followed https://github.com/neondatabase/autoscaling?tab=readme-ov-file#building-and-running
The postgres16-disk-test pod is always Pending:
root@iZbp19lce9chqq1glegm26Z:~/serverless/neon/autoscaling# kubectl get pod
NAME READY STATUS RESTARTS AGE
postgres16-disk-test-b88qv 0/2 Pending 0 6m1s
root@iZbp19lce9chqq1glegm26Z:~/serverless/neon/autoscaling# kubectl describe pod postgres16-disk-test-b88qv
Name: postgres16-disk-test-b88qv
...
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 6m10s autoscale-scheduler 0/3 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 2 Insufficient neonvm/kvm.
Warning FailedScheduling 45s autoscale-scheduler 0/3 nodes are available: 1 node(s) had untolerated taint {node
The autoscale-scheduler logs the following:
{"level":"error","ts":1713319685.30246,"logger":"autoscale-scheduler.plugin","caller":"plugin/plugin.go:343","msg":"Pod rejected by all Filter method calls","method":"Filter","virtualmachine":{"namespace":"default","name":"postgres16-disk-test"},"pod":{"namespace":"default","name":"postgres16-disk-test-b88qv"},"stacktrace":"github.com/neondatabase/autoscaling/pkg/plugin.(*AutoscaleEnforcer).
PostFilter\n\t/workspace/pkg/plugin/plugin.go:343\nk8s.io/kubernetes/pkg/scheduler/framework/runtime.(*frameworkImpl).runPostFilterPlugin\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/scheduler/framework/runtime/framework.go:776\nk8s.io/kubernetes/pkg/scheduler/framework/runtime.(*frameworkImpl).RunPostFilterPlugins\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/scheduler/framework/runtime/framework.go:759\nk8s.io/kubernetes/pkg/scheduler.
(*Scheduler).scheduleOne\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/scheduler/schedule_one.go:110\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:190\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:157\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:158\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:135\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:190\nk8s.io/apimachinery/pkg/util/wait.UntilWithContext\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:101"}
root@iZbp19lce9chqq1glegm26Z:~/serverless/neon/autoscaling# more vm-deploy.yaml
---
apiVersion: vm.neon.tech/v1
kind: VirtualMachine
metadata:
  name: postgres16-disk-test
  annotations:
    # In this example, these bounds aren't necessary. So... here's what they look like :)
    autoscaling.neon.tech/bounds: '{ "min": { "cpu": 0.25, "mem": "1Gi" }, "max": { "cpu": 1.25, "mem": "1Gi" } }'
  labels:
    autoscaling.neon.tech/enabled: "true"
    # Set to "true" to continuously migrate the VM (TESTING ONLY)
    autoscaling.neon.tech/testing-only-always-migrate: "false"
spec:
  schedulerName: autoscale-scheduler
  enableSSH: true
  guest:
    cpus: { min: 0.25, use: 0.25, max: 0.25 }
    memorySlotSize: 1Gi
    memorySlots: { min: 1, use: 1, max: 1 }
    rootDisk:
      image: pg16-disk-test:dev
      size: 1Gi
    ports:
      - port: 5432  # postgres
      - port: 9100  # metrics
      - port: 10301 # monitor
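If I read the spec right, guest memory is memorySlots × memorySlotSize, and with min = use = max on both cpus and memorySlots this VM is pinned to a single size. A quick sketch of that arithmetic (my own illustration, not code from the repo):

```python
# Guest memory is memorySlots * memorySlotSize (my reading of the spec above).
# With min = use = max = 1 slot of 1Gi, the VM is fixed at 1Gi and has no
# room to scale up, even though the bounds annotation allows up to 1.25 CPU.
slot_size_gib = 1
memory_slots = {"min": 1, "use": 1, "max": 1}

memory_gib = {k: slots * slot_size_gib for k, slots in memory_slots.items()}
print(memory_gib)  # {'min': 1, 'use': 1, 'max': 1}
```

So if the goal is to watch scaling happen, the max values in the guest spec presumably need to be raised above the min.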
root@iZbp19lce9chqq1glegm26Z:~/serverless/neon/autoscaling# kubectl get neonvm
NAME CPUS MEMORY POD EXTRAIP STATUS RESTARTS AGE
postgres16-disk-test postgres16-disk-test-b88qv Pending 5h10m
root@iZbp19lce9chqq1glegm26Z:~/serverless/neon/autoscaling# kubectl get pod postgres16-disk-test-b88qv -ojson
...
"status": {
"conditions": [
{
"lastProbeTime": null,
"lastTransitionTime": "2024-04-17T02:08:05Z",
"message": "0/3 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 2 Insufficient neonvm/kvm.",
"reason": "Unschedulable",
"status": "False",
"type": "PodScheduled"
}
],
"phase": "Pending",
"qosClass": "Burstable"
}
}
root@iZbp19lce9chqq1glegm26Z:~/serverless/neon/autoscaling# kubectl get node
NAME STATUS ROLES AGE VERSION
neonvm-root-control-plane Ready control-plane 23h v1.25.11
neonvm-root-worker Ready <none> 23h v1.25.11
neonvm-root-worker2 Ready <none> 23h v1.25.11
root@iZbp19lce9chqq1glegm26Z:~/serverless/neon/autoscaling# kubectl describe node neonvm-root-worker
Name: neonvm-root-worker
...
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 5560m (34%) 18360m (114%)
memory 3118Mi (10%) 6030Mi (19%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
neonvm/kvm 0 0
neonvm/vhost-net 0 0
Events: <none>
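My understanding (an assumption on my part, not taken from the plugin code) is that the scheduler's Filter step rejects both workers because the VM runner pod requests one neonvm/kvm device while the nodes advertise zero, which matches the "neonvm/kvm 0" line above. A toy version of that check:

```python
# Toy model of an extended-resource filter (my own sketch, not the real
# autoscale-scheduler plugin): a node passes only if every resource the pod
# requests is covered by the node's allocatable amounts.
def fits(requests: dict, allocatable: dict) -> bool:
    return all(allocatable.get(res, 0) >= qty for res, qty in requests.items())

pod_requests = {"neonvm/kvm": 1}  # assumed: the VM runner pod needs /dev/kvm
worker = {"neonvm/kvm": 0}        # matches the describe output above
print(fits(pod_requests, worker))  # False -> "Insufficient neonvm/kvm"
```

If that's right, the fix would be getting whatever device plugin advertises neonvm/kvm to run on the worker nodes (e.g. checking that /dev/kvm exists on the hosts), rather than anything in the VM spec itself.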
How can I get postgres16-disk-test to schedule successfully?
I'm a beginner, so thank you very much for your help.
In addition, which Neon service does postgres16-disk-test correspond to? Is there a way to simulate scaling for testing?
Running pgbench
root@iZbp19lce9chqq1glegm26Z:~/serverless/neon/autoscaling# scripts/run-bench.sh
If you don't see a command prompt, try pressing enter.
fetch https://dl-cdn.alpinelinux.org/alpine/v3.19/community/x86_64/APKINDEX.tar.gz
(1/8) Installing postgresql-common (1.2-r1)
Executing postgresql-common-1.2-r1.pre-install
(2/8) Installing lz4-libs (1.9.4-r5)
(3/8) Installing libpq (16.2-r0)
(4/8) Installing ncurses-terminfo-base (6.4_p20231125-r0)
(5/8) Installing libncursesw (6.4_p20231125-r0)
(6/8) Installing readline (8.2.1-r2)
(7/8) Installing zstd-libs (1.5.5-r8)
(8/8) Installing postgresql16-client (16.2-r0)
Executing busybox-1.36.1-r15.trigger
Executing postgresql-common-1.2-r1.trigger
* Setting postgresql16 as the default version
OK: 12 MiB in 23 packages
Running pgbench. Query:
select length(factorial(length(factorial(1223)::text)/2)::text);
pgbench: error: too many command-line arguments (first is "postgres")
pgbench: hint: Try "pgbench --help" for more information.
pod "pgbench-postgres16-disk-test" deleted
pod default/pgbench-postgres16-disk-test terminated (Error)
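For what it's worth, my guess (an assumption about pgbench's CLI, not something I checked in run-bench.sh) is that the script ends up passing an extra positional argument: PostgreSQL client tools take options first and exactly one trailing database name, so a second positional argument triggers this error. A toy illustration:

```python
# Toy illustration (not pgbench source): pgbench accepts one trailing dbname,
# and any further positional argument is reported as "too many command-line
# arguments (first is ...)". Option values are omitted here for simplicity.
def extra_positional(argv):
    positionals = [a for a in argv if not a.startswith("-")]
    return positionals[1] if len(positionals) > 1 else None

print(extra_positional(["postgres://example/db", "postgres"]))  # -> postgres
print(extra_positional(["postgres"]))                           # -> None
```

That would match the hint in the output, where the first rejected argument is "postgres".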
root@iZbp19lce9chqq1glegm26Z:~# kubectl get pod -w
NAME READY STATUS RESTARTS AGE
postgres16-disk-test-b88qv 0/2 Pending 0 14m
pgbench-postgres16-disk-test 0/1 Pending 0 0s
pgbench-postgres16-disk-test 0/1 Pending 0 0s
pgbench-postgres16-disk-test 0/1 ContainerCreating 0 0s
pgbench-postgres16-disk-test 0/1 ContainerCreating 0 1s
pgbench-postgres16-disk-test 1/1 Running 0 16s
pgbench-postgres16-disk-test 0/1 Error 0 18s
pgbench-postgres16-disk-test 0/1 Error 0 20s
pgbench-postgres16-disk-test 0/1 Terminating 0 20s
pgbench-postgres16-disk-test 0/1 Terminating 0 20s
While pgbench was running, the neonvm resource still showed no CPU or memory values (the VM was still Pending), and pgbench exited with an error.
root@iZbp19lce9chqq1glegm26Z:~# kubectl get neonvm
NAME CPUS MEMORY POD EXTRAIP STATUS RESTARTS AGE
postgres16-disk-test postgres16-disk-test-b88qv Pending 17m