Add MPI/ensemble support to distribute events across GPUs and compute nodes
To run large jobs on leadership-class machines, we'll need to parallelize across nodes by distributing events using MPI. A first attempt could simply partition events equally among tasks, but we should also investigate whether any MPI framework offers a queue-like model that dispatches events to waiting nodes (or "rebalances" events in case some processors get unlucky). A minimal sketch of the static partition is below the checklist.
- [ ] Split events among Runner to avoid replicating input
- [ ] Write JSON output to one file per process, rather than stdout
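As a rough illustration of the static approach (not an existing Celeritas API), here's a sketch of splitting events near-equally across MPI ranks and writing one JSON file per process; the event count, the `run_events` hook, and the output path scheme are all placeholders:

```cpp
// Sketch: static, near-equal partition of event IDs across MPI ranks,
// with per-rank JSON output. Names here are hypothetical placeholders,
// not Celeritas interfaces.
#include <mpi.h>

#include <algorithm>
#include <cstdio>
#include <string>
#include <vector>

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);

    int rank = 0;
    int size = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Total event count would come from the problem input (placeholder value)
    int const num_events = 1024;

    // Block partition: every rank gets floor(N/P) events, and the first
    // (N mod P) ranks take one extra, so counts differ by at most one.
    int const base = num_events / size;
    int const rem = num_events % size;
    int const local_count = base + (rank < rem ? 1 : 0);
    int const first_event = rank * base + std::min(rank, rem);

    std::vector<int> local_events(local_count);
    for (int i = 0; i < local_count; ++i)
    {
        local_events[i] = first_event + i;
    }

    // Placeholder for transporting this rank's share of events:
    // run_events(local_events);

    // One JSON output file per process instead of writing to stdout
    std::string out_path = "result." + std::to_string(rank) + ".json";
    if (std::FILE* f = std::fopen(out_path.c_str(), "w"))
    {
        std::fprintf(f, "{\"rank\": %d, \"num_events\": %d}\n", rank, local_count);
        std::fclose(f);
    }

    MPI_Finalize();
    return 0;
}
```

A queue-based scheme would replace the fixed `first_event`/`local_count` calculation with a manager rank that hands out event batches as workers finish, which is what would let us rebalance when some ranks get unlucky.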
We're not going to have this in time to do any big runs on Summit for the SciDAC proposal, so let's defer to Q3.