Add documentation regarding HPC systems

Open jessica-mitchell opened this issue 2 years ago • 2 comments

This PR adds documentation regarding HPC including

  • a general overview of hardware and software components
  • an example Slurm script
  • explanation of MPI and threading
  • reference to beNNch for benchmarking
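
As a rough illustration of how the Slurm and MPI/threading items relate (a minimal sketch, not taken from the PR itself): in a hybrid job, the number of NEST virtual processes is the number of MPI processes times the threads per process. The `HybridLayout` class and its field names are hypothetical and exist only for this example. The `#SBATCH` options it prints are standard Slurm options, but the values are made up and would need to match the actual machine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridLayout:
    """Hypothetical helper describing a hybrid MPI + threading job layout."""

    nodes: int            # number of compute nodes requested
    mpi_per_node: int     # MPI processes (Slurm tasks) per node
    threads_per_mpi: int  # threads per MPI process

    @property
    def mpi_processes(self) -> int:
        # Total MPI processes across all nodes.
        return self.nodes * self.mpi_per_node

    @property
    def virtual_processes(self) -> int:
        # NEST counts one virtual process per thread of every MPI process.
        return self.mpi_processes * self.threads_per_mpi

    def sbatch_header(self) -> str:
        # Render the matching resource request as Slurm batch directives.
        lines = [
            "#!/bin/bash",
            f"#SBATCH --nodes={self.nodes}",
            f"#SBATCH --ntasks-per-node={self.mpi_per_node}",
            f"#SBATCH --cpus-per-task={self.threads_per_mpi}",
        ]
        return "\n".join(lines)


# Assumed example values: 2 nodes, 2 MPI processes per node, 8 threads each.
layout = HybridLayout(nodes=2, mpi_per_node=2, threads_per_mpi=8)
print(layout.mpi_processes)      # 4
print(layout.virtual_processes)  # 32
print(layout.sbatch_header())
```

Inside the simulation script, the thread count would then be passed to NEST through the `local_num_threads` kernel parameter so that it agrees with `--cpus-per-task`.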

jessica-mitchell avatar May 20 '22 11:05 jessica-mitchell

Pull request automatically marked stale!

github-actions[bot] avatar Jul 20 '22 08:07 github-actions[bot]

@jessica-mitchell, could you comment on the status of this one? The mentioned manuscripts have meanwhile been published

and maybe the workflow for optimizations on HPC would also be interesting to mention:

terhorstd avatar Aug 02 '22 07:08 terhorstd

Pull request automatically marked stale!

github-actions[bot] avatar Nov 12 '22 08:11 github-actions[bot]

This PR is ready for review.

@ackurth @JoseJVS There are a couple of spots where I have questions about how to complete a statement or whether the statement is true, particularly in slurm_script.rst and mpi_process.rst.

In addition to the parts on optimizing HPC for NEST, there is also a benchmarking part on beNNch that @jasperalbers contributed to.

See the output here: https://nest-test.readthedocs.io/en/add-hpc-docs/hpc/optimizing_nest.html#optimize-performance and https://nest-test.readthedocs.io/en/add-hpc-docs/hpc/benchmarking.html#benchmark

jessica-mitchell avatar Dec 01 '22 08:12 jessica-mitchell

@ackurth Can you also take a look at the text?

jessica-mitchell avatar Dec 07 '22 08:12 jessica-mitchell

@jessica-mitchell I think we should reconsider the intention of these guides. Showing how NEST can be optimized by scaling the number of threads and using multiple MPI processes is a good thing. However, I think a guide on thread pinning and process binding is a delicate topic. Although @ackurth explored the performance impact of different thread pinning and process binding affinity schemes in his paper, these configurations are not simple to explain and are tightly coupled to the type of machine the user has access to. Trying to explain these topics in short guides would be complicated and, in the worst case, misleading for the reader. For this, I think it would be better to point to the sub-real-time paper https://doi.org/10.1088/2634-4386/ac55fc and maybe some additional literature.

JoseJVS avatar Dec 07 '22 09:12 JoseJVS

@JoseJVS @ackurth I think I made all the suggested changes, as per our discussion a few weeks ago. Many thanks for your insights :) Here is the output: https://nest-test.readthedocs.io/en/add-hpc-docs/hpc/optimizing_nest.html#optimize-performance

@ackurth I believe you still have one thing you want to add to the document?

jessica-mitchell avatar Feb 17 '23 15:02 jessica-mitchell

Pull request automatically marked stale!

github-actions[bot] avatar Apr 19 '23 08:04 github-actions[bot]