magpie
Re-visit estimates on thread counts, process limits, file descriptor limits, etc.
Newer systems may have hyper-threading enabled, and core counts are now much larger than before. Estimates for the number of threads daemons should use to handle communications (for example in the NameNode, DataNode, etc.) were previously based on node count. These estimates may now be out of date and may need to be calculated differently. Revisit the calculations behind these estimates.
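For reference, one widely cited Hadoop heuristic sizes dfs.namenode.handler.count at roughly 20 * log2(cluster size). A minimal sketch of that kind of node-count-based estimate (the function name and minimum value are illustrative, not Magpie's actual code):

```python
import math

def namenode_handler_count(num_datanodes):
    """Common Hadoop heuristic: 20 * log2(cluster size), floor of 10.

    Illustrative only; Magpie's actual node-count-based estimate may
    differ and is exactly the thing this issue says to revisit.
    """
    if num_datanodes < 2:
        return 10
    return max(10, int(20 * math.log2(num_datanodes)))

print(namenode_handler_count(64))   # 120 for a 64-node cluster
```

Whether a purely node-count-based formula like this still holds on dense, hyper-threaded nodes is the open question.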
In addition, the maximum number of tasks (such as in Hadoop or Spark) may also need to be re-estimated. While 8-24 cores per node may have been common in the past, with hyper-threading 48-64 is not unreasonable. The trade-off of more threads/tasks may no longer be balanced in favor of big data applications. Reconsider how the maximum number of threads/tasks per node is determined.
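A sketch of the kind of recalculation being suggested: discount logical cores when hyper-threading is on, rather than treating them as full cores. The discount factor and reserved-core count below are hypothetical assumptions, not Magpie settings:

```python
def max_tasks_per_node(logical_cores, hyperthreading=True,
                       ht_discount=0.75, reserve_cores=2):
    """Estimate task slots per node.

    With hyper-threading, logical cores overstate usable parallelism
    for CPU-bound big-data tasks, so discount them (ht_discount is an
    illustrative assumption). Also reserve a few cores for daemons and
    the OS.
    """
    usable = logical_cores * ht_discount if hyperthreading else logical_cores
    return max(1, int(usable) - reserve_cores)

print(max_tasks_per_node(64))                       # 46 on a 64-thread node
print(max_tasks_per_node(16, hyperthreading=False)) # 14 on an older node
```

The point is that 64 logical cores should probably not translate directly into 64 task slots.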
In addition, prior estimates for per-process file descriptor limits, process limits, etc. may need to be reconsidered.
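Any recalculated estimates should be checked against the limits actually in effect on the node; one way to read them is Python's standard resource module:

```python
import resource

# Per-process open file descriptor limits (soft, hard).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open-file limits: soft={soft}, hard={hard}")

# Per-user process/thread limits (Linux).
soft_np, hard_np = resource.getrlimit(resource.RLIMIT_NPROC)
print(f"process limits: soft={soft_np}, hard={hard_np}")
```

Estimates that exceed the soft limits here will fail at runtime regardless of how the thread/task math works out.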
In addition, the default reducer count in TeraSort should be revisited. Is 2 per node still a reasonable default?
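The old per-node rule can be contrasted with a core-scaled alternative; both formulas below are illustrative sketches, not Magpie's current defaults:

```python
def terasort_reducers_per_node_rule(nodes, per_node=2):
    """Legacy rule: a fixed number of reducers per node."""
    return nodes * per_node

def terasort_reducers_core_rule(nodes, cores_per_node, fraction=0.5):
    """Alternative: scale reducers with a fraction of total cores
    (fraction is an illustrative assumption)."""
    return max(nodes, int(nodes * cores_per_node * fraction))

print(terasort_reducers_per_node_rule(16))   # 32 reducers on 16 nodes
print(terasort_reducers_core_rule(16, 64))   # 512 on dense 64-thread nodes
```

On hyper-threaded nodes the two rules diverge by more than an order of magnitude, which is why the default deserves a second look.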