Allow Configurable Number of CPU Cores for Multi-Threading in DiceDB

Open dograprabhav opened this issue 1 year ago • 5 comments

Was going through the code of DiceDB, specifically this snippet:

	var numCores int
	if config.EnableMultiThreading {
		serverErrCh = make(chan error, 1)
		numCores = runtime.NumCPU()
		logr.Debug("The DiceDB server has started in multi-threaded mode.", slog.Int("number of cores", numCores))
	} else {
		serverErrCh = make(chan error, 2)
		logr.Debug("The DiceDB server has started in single-threaded mode.")
		numCores = 1
	}
	runtime.GOMAXPROCS(numCores)

Currently, DiceDB dynamically configures its threading model based on the EnableMultiThreading flag in the configuration, defaulting to the number of logical CPU cores available via runtime.NumCPU() when multi-threading is enabled. However, this approach can be problematic in containerized environments, such as Kubernetes, where resource constraints (e.g., CPU limits) are enforced per container.

In Kubernetes, a container can be given a CPU limit, and runtime.NumCPU() does not necessarily reflect it. For example, if the host machine has 16 cores but a container is limited to 4, DiceDB would still set GOMAXPROCS to 16, because the snippet above passes runtime.NumCPU() straight to runtime.GOMAXPROCS. This could lead to resource contention, poor performance, and CPU throttling once the container exceeds its allocation.
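To make the mismatch concrete, here is a minimal sketch (not DiceDB code) that prints runtime.NumCPU() next to the limit found in the cgroup v2 cpu.max file. It assumes cgroup v2 is mounted at /sys/fs/cgroup inside the container; the helper name cpuLimitFromCPUMax is hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"strconv"
	"strings"
)

// cpuLimitFromCPUMax parses the contents of a cgroup v2 cpu.max file, which
// holds "<quota> <period>" in microseconds, or "max <period>" when no limit
// is set. It returns the limit rounded up to whole cores, and false when
// there is no limit or the contents cannot be parsed.
// (Hypothetical helper for illustration; not part of DiceDB.)
func cpuLimitFromCPUMax(contents string) (int, bool) {
	fields := strings.Fields(contents)
	if len(fields) != 2 || fields[0] == "max" {
		return 0, false
	}
	quota, err := strconv.ParseInt(fields[0], 10, 64)
	if err != nil || quota <= 0 {
		return 0, false
	}
	period, err := strconv.ParseInt(fields[1], 10, 64)
	if err != nil || period <= 0 {
		return 0, false
	}
	// Integer ceiling of quota/period.
	return int((quota + period - 1) / period), true
}

func main() {
	fmt.Println("runtime.NumCPU():", runtime.NumCPU())

	// Assumption: cgroup v2 mounted at the usual path inside the container.
	data, err := os.ReadFile("/sys/fs/cgroup/cpu.max")
	if err != nil {
		fmt.Println("could not read cpu.max:", err)
		return
	}
	if limit, ok := cpuLimitFromCPUMax(string(data)); ok {
		fmt.Println("cgroup CPU limit (cores, rounded up):", limit)
	} else {
		fmt.Println("no CPU limit set")
	}
}
```

In a container limited to 4 CPUs on a 16-core host, I would expect the two numbers to differ, since runtime.NumCPU() looks at the CPUs the process may be scheduled on rather than at the quota.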

Issue Summary:

  • Current behavior: DiceDB defaults to using all available CPU cores on the host machine (runtime.NumCPU()).
  • Problem: In environments like Kubernetes, containers may have restricted CPU allocations, and the use of runtime.NumCPU() could exceed the allocated resources.
  • Proposed solution: Introduce a configuration option allowing the user to specify the number of cores to use dynamically, either through environment variables or a dedicated configuration field. This would ensure DiceDB respects the container's CPU allocation, preventing resource overuse and improving compatibility with cloud-native environments.
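A rough sketch of what the proposed option could look like, assuming a user-supplied core count that falls back to runtime.NumCPU() when unset. The environment variable name DICEDB_NUM_CORES and the helpers resolveNumCores and requestedFromEnv are hypothetical and only for illustration; they are not existing DiceDB names.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"strconv"
)

// resolveNumCores picks the value to pass to runtime.GOMAXPROCS.
// requested is the user-supplied core count (0 or negative means "not set");
// available is the number of logical CPUs reported by the runtime.
// (Hypothetical helper; not part of DiceDB.)
func resolveNumCores(multiThreaded bool, requested, available int) int {
	if !multiThreaded {
		return 1
	}
	// Fall back to all available cores when nothing valid was requested,
	// and never go above what the runtime reports.
	if requested <= 0 || requested > available {
		return available
	}
	return requested
}

// requestedFromEnv parses the value of a hypothetical DICEDB_NUM_CORES
// variable. It returns 0 when the value is empty or not a positive integer.
func requestedFromEnv(value string) int {
	n, err := strconv.Atoi(value)
	if err != nil || n <= 0 {
		return 0
	}
	return n
}

func main() {
	requested := requestedFromEnv(os.Getenv("DICEDB_NUM_CORES"))
	// true stands in for config.EnableMultiThreading from the snippet above.
	numCores := resolveNumCores(true, requested, runtime.NumCPU())
	runtime.GOMAXPROCS(numCores)
	fmt.Println("GOMAXPROCS set to", numCores)
}
```

With this shape, a container limited to 4 CPUs could be started with DICEDB_NUM_CORES=4, while an unset variable keeps the current behavior of using every core the runtime reports.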

I can pick up these changes. Can you assign this issue to me, @lucifercr07 @JyotinderSingh @AshwinKul28 @arpitbbhayani?

dograprabhav avatar Oct 04 '24 13:10 dograprabhav

@dograprabhav I was thinking that workloads such as databases would run on dedicated nodes. We are designing DiceDB to run optimally on modern multi-core hardware, extracting the max performance out of it. However, I agree that we need more sophisticated logic for running DiceDB than blindly allocating all cores to it. Regardless, this change may be helpful in test environments. @arpitbbhayani @AshwinKul28 @lucifercr07 wdyt?

soumya-codes avatar Oct 04 '24 13:10 soumya-codes

Adding it to icebox for now.

lucifercr07 avatar Oct 05 '24 10:10 lucifercr07

@lucifercr07 can I take this up?

hprasad99 avatar Oct 07 '24 19:10 hprasad99

An optional config for the number of CPUs sounds like a good idea. Can I work on this?

probablyArth avatar Oct 09 '24 16:10 probablyArth

This seems to be resolved now, so it can be closed.

prabhavdogra avatar Mar 04 '25 12:03 prabhavdogra