Multi-node IOR benchmark on Chameleon
We want to model a checkpoint/restart workload.
- 4 client nodes, with 40 ranks each (out of 48)
- 4 PFS server nodes (HDD)
- 2 BB server nodes (SSD)
- Run Hermes as a daemon
- Run `ior -w` to simulate a checkpoint
- Run `ior -r` to simulate a restart
- For the baseline, the checkpoint phase will exit once the data is flushed to PFS, and the restart phase will read from PFS.
- Hermes will store the checkpoint in its storage hierarchy, so we expect both the write (checkpoint) and read (restart) phases to be faster than the baseline.
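
The baseline plan above can be sketched as a small Python helper that builds the two `ior` command lines without running them. This is a minimal sketch under stated assumptions, not the actual benchmark script: the test file path, block size, and transfer size are hypothetical placeholders, and `-F` (file per process), `-k` (keep the file for the restart phase), and `-e` (fsync on write close) are illustrative choices that the plan does not specify. Per-node rank placement (40 ranks on each of the 4 client nodes) depends on the MPI launcher and is left out, as is any Hermes-specific setup.

```python
import shlex

# Layout taken from the plan above.
CLIENT_NODES = 4
RANKS_PER_NODE = 40
TOTAL_RANKS = CLIENT_NODES * RANKS_PER_NODE

# Hypothetical values, chosen only for illustration.
TEST_FILE = "/mnt/pfs/ior_checkpoint.dat"  # assumed PFS mount point
BLOCK_SIZE = "1g"                          # assumed per-rank data size
TRANSFER_SIZE = "1m"                       # assumed size of each I/O request


def ior_command(phase):
    """Return the argument list for one phase: 'checkpoint' or 'restart'."""
    if phase == "checkpoint":
        # -w: write only, -k: keep the file so the restart can read it,
        # -e: fsync on close so the phase ends once data is flushed.
        mode = ["-w", "-k", "-e"]
    elif phase == "restart":
        # -r: read only.
        mode = ["-r"]
    else:
        raise ValueError(f"unknown phase: {phase}")
    return [
        "mpirun", "-n", str(TOTAL_RANKS),
        "ior", *mode,
        "-F",                 # one file per rank (an assumption)
        "-t", TRANSFER_SIZE,
        "-b", BLOCK_SIZE,
        "-o", TEST_FILE,
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them (dry run).
    for phase in ("checkpoint", "restart"):
        print(shlex.join(ior_command(phase)))
```

Printing the commands first makes it easy to check the rank count and flags before submitting the job; the same argument lists could later be passed to `subprocess.run` on a client node.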