Multi-node IOR benchmark on Chameleon

Open ChristopherHogan opened this issue 4 years ago • 0 comments

We want to model a checkpoint/restart workload.

4 client nodes, with 40 ranks each (out of 48)
4 PFS server nodes (HDD)
2 BB server nodes (SSD)
Run Hermes as a daemon
Run ior -w to simulate a checkpoint
Run ior -r to simulate a restart
For the baseline (no Hermes), the checkpoint phase exits once the data is flushed to the PFS, and the restart phase reads from the PFS.
With Hermes, the checkpoint is stored in the hierarchy, so we should see faster writes and reads than in the baseline.
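
As a rough sketch of the two IOR phases above, the script below builds the command lines for the checkpoint (`ior -w`) and restart (`ior -r`) runs. Several things here are assumptions, not part of the plan above: the `mpirun` launcher, the block and transfer sizes, file-per-process mode (`-F`), and the hypothetical test file path `/mnt/pfs/ckpt/testfile`. Placing 40 ranks on each node needs a launcher-specific option that is left out, and starting the Hermes daemon is not shown.

```python
import shlex

# Values taken from the plan above.
CLIENT_NODES = 4
RANKS_PER_NODE = 40
TOTAL_RANKS = CLIENT_NODES * RANKS_PER_NODE


def ior_command(phase, test_file, block_size="1g", transfer_size="1m"):
    """Return the argv list for one IOR phase.

    block_size and transfer_size are placeholder assumptions; tune them
    to match the checkpoint size being modeled.
    """
    if phase == "checkpoint":
        # -w: write, -e: fsync on POSIX write close (so the run ends only
        # after data is flushed), -k: keep the files for the restart phase.
        mode = ["-w", "-e", "-k"]
    elif phase == "restart":
        # -r: read back the files written by the checkpoint phase.
        mode = ["-r"]
    else:
        raise ValueError(f"unknown phase: {phase!r}")

    return [
        "mpirun", "-n", str(TOTAL_RANKS),  # per-node placement flag omitted
        "ior", *mode,
        "-F",                 # assumption: one file per rank
        "-b", block_size,     # bytes each rank writes per segment
        "-t", transfer_size,  # size of each I/O call
        "-o", test_file,
    ]


if __name__ == "__main__":
    path = "/mnt/pfs/ckpt/testfile"  # hypothetical PFS mount point
    for phase in ("checkpoint", "restart"):
        print(shlex.join(ior_command(phase, path)))
```

The same two commands would be used for both the baseline and the Hermes runs, so that the only difference between them is whether Hermes is intercepting the I/O.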

Oct 28 '21 14:10 ChristopherHogan