CPU and smp calculation are wrong

Open dkropachev opened this issue 4 years ago • 2 comments

Describe the bug

Spinning up a Scylla cluster on GKE (n1-standard-8, 8 CPU cores) resulted in:

--smp=6 --cpuset=0-7

Resulting command line:

/usr/bin/scylla --log-to-syslog 0 --log-to-stdout 1 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-7 --smp 6 --listen-address 10.142.0.53 --rpc-address 10.142.0.53 --seed-provider-parameters seeds=10.3.243.243,10.3.246.86 --broadcast-address 10.3.246.86 --broadcast-rpc-address 10.3.246.86 --blocked-reactor-notify-ms 999999999

To Reproduce

Steps to reproduce the behavior:

  1. Deploy the Operator on GKE as advised in ./examples/gke
  2. Deploy Scylla
  3. Log into a Scylla node and check the arguments Scylla is running with

Expected behavior

It is expected to be something like:

--smp=7 --cpuset=0-6

or

--smp=6 --cpuset=0-4,6-7

or

--cpuset=0-4,6-7

Logs

  • https://cloudius-jenkins-test.s3.amazonaws.com/04398b18-cc08-4f3b-8a64-3330751c1f2e/20201130_192731/db-cluster-04398b18.zip

Environment:

  • Platform: GKE
  • Kubernetes version: 1.15.12-gke.20
  • Scylla version: 4.2.0
  • Scylla-operator version: nightly

dkropachev, Dec 01 '20 08:12

I can confirm all issues pointed out by @dkropachev (#282 #281 #280 #279). https://github.com/scylladb/scylla-operator/issues/280 and https://github.com/scylladb/scylla-operator/issues/279 are pretty basic stuff. 🤯

rogaha, Jun 06 '21 19:06

SMP is calculated from the user-provided resources, whereas the cpuset is assigned by the kubelet based on the Pod's QoS class. For Burstable and BestEffort Pods, the cpuset is set to all available cores on the host, because these Pods get shared CPU access and can be executed on any of the CPUs. Only for Guaranteed QoS does the Pod get CPUs exclusively, and only then will the smp number match the cpuset.
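For illustration only, the relationship described above can be sketched in Python. This is not the operator's code: `Resources`, `qos_class`, and `expected_smp_and_cpuset_size` are hypothetical names, the model covers a single-container Pod, and the rule that smp is taken from the CPU limit (falling back to the request) is an assumption. It also assumes that exclusive CPUs require the kubelet's static CPU manager policy and an integer CPU count.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Resources:
    """Requests/limits of a single-container Pod (hypothetical model)."""
    cpu_request: Optional[float] = None
    cpu_limit: Optional[float] = None
    mem_request: Optional[int] = None
    mem_limit: Optional[int] = None


def qos_class(r: Resources) -> str:
    """Derive the Pod QoS class from requests and limits."""
    values = (r.cpu_request, r.cpu_limit, r.mem_request, r.mem_limit)
    if all(v is None for v in values):
        return "BestEffort"
    # Kubernetes defaults a missing request to the limit.
    cpu_req = r.cpu_request if r.cpu_request is not None else r.cpu_limit
    mem_req = r.mem_request if r.mem_request is not None else r.mem_limit
    if (r.cpu_limit is not None and r.mem_limit is not None
            and cpu_req == r.cpu_limit and mem_req == r.mem_limit):
        return "Guaranteed"
    return "Burstable"


def expected_smp_and_cpuset_size(
    r: Resources, host_cpus: int, static_cpu_manager: bool = True
) -> Tuple[str, int, int]:
    """Return (QoS class, smp, number of CPUs in the cpuset)."""
    cls = qos_class(r)
    # Assumption: smp follows the CPU limit, else the request, else the host.
    cpu = r.cpu_limit if r.cpu_limit is not None else r.cpu_request
    smp = int(cpu) if cpu is not None and cpu >= 1 else host_cpus
    # Exclusive cores only for Guaranteed Pods with a whole number of CPUs.
    exclusive = (cls == "Guaranteed" and static_cpu_manager
                 and cpu is not None and float(cpu).is_integer())
    cpuset_size = int(cpu) if exclusive else host_cpus
    return cls, smp, cpuset_size


# Memory request != limit makes the Pod Burstable: smp and cpuset diverge.
burstable = Resources(cpu_request=6, cpu_limit=6, mem_request=16, mem_limit=32)
# All requests equal their limits: Guaranteed, so smp matches the cpuset.
guaranteed = Resources(cpu_request=6, cpu_limit=6, mem_request=32, mem_limit=32)

print(expected_smp_and_cpuset_size(burstable, host_cpus=8))
print(expected_smp_and_cpuset_size(guaranteed, host_cpus=8))
```

Under these assumptions, a Burstable Pod on an 8-core host ends up with an smp of 6 and a cpuset spanning all 8 cores, which is one hypothetical way to arrive at the combination reported in this issue.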

zimnx, Jun 07 '21 11:06

@tnozicka, @zimnx, could you please check whether this is still relevant?

dkropachev, Jun 06 '23 18:06

The Scylla Operator project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 30d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out

/lifecycle stale