chiapos

Allow limiting number of plotting processes during Stage 1

Open zmeyc opened this issue 3 years ago • 5 comments

This PR is based on the unmerged https://github.com/Chia-Network/chiapos/pull/176. Please review only the last commit; I'll rebase once 176 is merged. It can be merged separately if needed. It will also require adding command-line switches in chia-blockchain.

Sample use case: when starting 16 plotters (each using 4 threads) on a 16-thread CPU, it's best to stagger their runs so they don't lock the computer up during Stage 1 (multithreading is supported only in Stage 1).

Currently this has to be done with an external script:

  1. Measure approximate stage 1 time
  2. Run 4 plotters
  3. Sleep measured time
  4. Run next 4 plotters
  5. Sleep measured time, and so on
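For reference, the external-script approach in the steps above could look roughly like the sketch below. It is illustrative only: the helper names `batches` and `run_staggered` are made up for this example, and the measured Stage 1 time has to be supplied by hand.

```python
import subprocess
import time
from typing import Callable, List, Sequence


def batches(commands: Sequence[List[str]], group_size: int) -> List[List[List[str]]]:
    """Split the plotter command lines into consecutive groups of group_size."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return [list(commands[i:i + group_size]) for i in range(0, len(commands), group_size)]


def run_staggered(
    commands: Sequence[List[str]],
    group_size: int,
    stage1_seconds: float,
    launch: Callable = subprocess.Popen,
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Start one group of plotters, sleep for the measured Stage 1 time, repeat."""
    started = []
    groups = batches(commands, group_size)
    for index, group in enumerate(groups):
        for command in group:
            started.append(launch(command))
        # No need to wait after the last group has been started.
        if index < len(groups) - 1:
            sleep(stage1_seconds)
    return started
```

The `launch` and `sleep` parameters exist only so the schedule can be exercised without starting real plotters. The sketch makes the weakness visible: the only coordination is the fixed sleep, so nothing reacts if a group runs long or a plotter is restarted.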

But this setup is error-prone:

  • On any slowdown, plotter groups will clash
  • It's not possible to restart crashed plotters, because all timings will be off

As a solution, this patch uses lock files to allow only up to a specified number of processes in Stage 1. Other plotters will wait for their turn. Plotters can be stopped and restarted at any time without consequences. Locks (slots) are automatically released if plotters crash or are stopped.
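The idea can be sketched in a few lines of Python. This is not the code from the patch (which is C++ and not shown here), only a minimal illustration assuming a POSIX system with `flock`. The slot file naming and the function names are assumptions made for this example.

```python
import fcntl
import os
import time
from typing import Optional


def try_acquire_slot(runtime_dir: str, max_procs: int) -> Optional[int]:
    """Try to lock one of max_procs slot files; return the open fd, or None if all are taken."""
    os.makedirs(runtime_dir, exist_ok=True)
    for slot in range(max_procs):
        # Hypothetical file naming, chosen for this sketch only.
        path = os.path.join(runtime_dir, f"p1_slot_{slot}.lock")
        fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
        try:
            # Non-blocking exclusive lock: fails immediately if another holder exists.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return fd
        except BlockingIOError:
            os.close(fd)
    return None


def acquire_slot(runtime_dir: str, max_procs: int, poll_seconds: float = 1.0) -> int:
    """Block until a slot is free, polling the slot files."""
    while True:
        fd = try_acquire_slot(runtime_dir, max_procs)
        if fd is not None:
            return fd
        time.sleep(poll_seconds)


def release_slot(fd: int) -> None:
    """Release the slot explicitly when leaving Stage 1."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
```

Because the kernel drops a `flock` lock when the holding process exits and its descriptor is closed, a crashed or killed plotter frees its slot without any cleanup step, which is the property described above.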

How to test:

./ProofOfSpace create -k 22 --p1maxproc 2 --runtimedir . -d dest1
./ProofOfSpace create -k 22 --p1maxproc 2 --runtimedir . -d dest2
./ProofOfSpace create -k 22 --p1maxproc 2 --runtimedir . -d dest3

The runtime directory (--runtimedir) should be the same for all plotters (in production it could default to ~/.config/chia/run). The third process will wait until the first two leave Stage 1.

zmeyc, Mar 09 '21 21:03

I would expect that setting the priority of the threads really low (that is, a high nice level) would also solve this problem, or at least mitigate it.

Have you tested that?

If that works well, it would be a bit more scalable going forward. We may want to parallelize more steps, like sorting, in the future. And if the OS scheduler could sort this out itself, that would be ideal.
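For anyone who wants to try this, a minimal sketch of lowering the priority from inside a process on a Unix-like system is shown below. It uses Python's `os.nice`; the function name `lower_priority` and the default increment are assumptions for illustration, and chiapos itself would need the equivalent call in C++.

```python
import os


def lower_priority(increment: int = 10) -> int:
    """Add increment to this process's nice value and return the new value.

    A higher nice value means a lower scheduling priority. Unprivileged
    processes may only increase it, so negative increments are rejected here.
    """
    if increment < 0:
        raise ValueError("increment must not be negative")
    return os.nice(increment)
```

Calling this once at startup, before any worker threads are created, is the simplest way to make sure the whole plotter runs at the lowered priority.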

arvidn, Mar 20 '21 16:03

@arvidn Interesting idea, I haven't tried that. It might keep the GUI from freezing when threads are overprovisioned; I'll check whether it helps. But Phase 1 will still take longer overall if all processes are in it simultaneously. It might be worth setting a low priority as an additional measure.

Another concern is adding more args to the Python bindings. Would it be better to pass the params as a struct in CreatePlotDisk? For example, Deno's linter has a rule: "A function that is part of the public API takes 0-2 required arguments, plus (if necessary) an options object (so max 3 total)". I try to follow this not only in TS; imo it's less error-prone because params are passed by name and order doesn't matter. It will be easier to add/remove fields too.
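To illustrate the options-object idea on the Python side, here is a rough sketch. The class `PlotOptions`, its field names, and `build_command` are hypothetical and not part of the chiapos bindings; the flags are the ones used in the test commands earlier in this PR.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PlotOptions:
    """Hypothetical options object; fields mirror the CLI flags used in this PR."""
    k: int
    dest_dir: str
    p1_max_proc: int = 0   # 0 is used here to mean "no limit" (an assumption of this sketch)
    runtime_dir: str = "."


def build_command(options: PlotOptions, binary: str = "./ProofOfSpace") -> List[str]:
    """Turn the options into the equivalent ProofOfSpace command line."""
    command = [binary, "create", "-k", str(options.k)]
    if options.p1_max_proc > 0:
        command += ["--p1maxproc", str(options.p1_max_proc),
                    "--runtimedir", options.runtime_dir]
    command += ["-d", options.dest_dir]
    return command
```

Every field is passed by name, so adding or removing an option does not shift the position of the others at any call site.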

zmeyc, Mar 21 '21 12:03

Required changes in chia-blockchain: https://github.com/zmeyc/chia-blockchain/commit/46bb5f397d6017fb80e76635b9dc2d73d4ac4085

zmeyc, Mar 21 '21 12:03

Lowering priorities, but setting them to a different value for each plotter, could also work. Theoretically this would let one plotter be prioritized and finish faster, instead of all of them stalling each other.
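A small sketch of what that could look like when launching plotters from a script, assuming the standard POSIX `nice` utility is available. The helper names `niceness_ladder` and `nice_command` are made up for this example, and whether this actually helps has not been measured.

```python
from typing import List


def niceness_ladder(num_plotters: int, lowest: int = 1, highest: int = 19) -> List[int]:
    """Give each plotter its own nice value, clamped at the highest allowed value."""
    return [min(lowest + index, highest) for index in range(num_plotters)]


def nice_command(niceness: int, command: List[str]) -> List[str]:
    """Prefix a command line so that it starts at the given nice value."""
    return ["nice", "-n", str(niceness)] + list(command)
```

With more plotters than distinct nice values, the last ones share the highest value, so the ordering only holds among the first few.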

zmeyc, Mar 22 '21 09:03

'This PR has been flagged as stale due to no activity for over 60 days. It will not be automatically closed, but it has been given a stale-pr label and should be manually reviewed.'

github-actions[bot], Aug 12 '21 11:08