NNI stops running trials, without any error message, once CPU utilization has reached 100%

Open BitCalSaul opened this issue 9 months ago • 0 comments

Describe the issue: Once CPU utilization has reached 100%, even a single time, NNI finishes the trials that are already running but never starts the remaining ones. No error message is shown.

Environment:

  • NNI version: 3.0
  • Training service (local|remote|pai|aml|etc): local
  • Client OS: ubuntu 22.04, 20.04
  • Server OS (for remote mode only):
  • Python version: 3.8
  • PyTorch/TensorFlow version: 2.1.0
  • Is conda/virtualenv/venv used?: conda
  • Is running in Docker?: no

How to reproduce it?: Run an experiment whose trials are CPU-intensive, with several trials running simultaneously so that CPU utilization reaches 100%. After the running trials finish, no further trials are started.
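For reference, a minimal sketch of a CPU-heavy trial script that could be used to trigger this. This is not the script from my actual experiment: the `burn_cpu` and `saturate_cores` helpers and the `seconds` parameter are hypothetical names made up for illustration, and I am assuming the experiment is configured to run more than one trial at a time. It only uses `nni.get_next_parameter()` and `nni.report_final_result()` from NNI, and falls back to plain printing if NNI is not installed.

```python
import multiprocessing as mp
import time


def burn_cpu(seconds: float) -> int:
    """Busy-loop on one core for roughly `seconds`; return the iteration count."""
    deadline = time.perf_counter() + seconds
    iterations = 0
    while time.perf_counter() < deadline:
        iterations += 1
    return iterations


def saturate_cores(seconds: float, workers: int) -> int:
    """Run `burn_cpu` in `workers` processes at once and sum their counts."""
    with mp.Pool(processes=workers) as pool:
        return sum(pool.map(burn_cpu, [seconds] * workers))


if __name__ == "__main__":
    try:
        import nni  # optional: the script also runs standalone without NNI
    except ImportError:
        nni = None

    # "seconds" is a hypothetical search-space entry, not part of NNI itself.
    params = (nni.get_next_parameter() if nni else None) or {}
    seconds = float(params.get("seconds", 60))

    # One worker per logical core, so a single trial can already max out the CPU.
    total = saturate_cores(seconds, workers=mp.cpu_count())

    if nni:
        nni.report_final_result(total)
    else:
        print(total)
```

Running several of these trials concurrently should keep every core busy for about `seconds` seconds, which is the situation in which the remaining trials stopped being scheduled for me.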

I think this is the same issue as #965.

BitCalSaul, May 02 '24 02:05