
Strided: Analyze overhead of strided datatype creation and decoding

Open minsii opened this issue 5 years ago • 0 comments

Description: Analyze the overhead of strided datatype creation and decoding in the strided RMA path. Configure OSHMPI with --disable-strided-cache to disable its datatype cache optimization; otherwise the cache would hide the creation cost being measured.

Starting point:

  • Look into the OSHMPI_create_strided_dtype function in OSHMPI.
  • tests/iput.c can be used as the test program.

Hints: The datatype created in OSHMPI is always a resized vector with blocklength=1.
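To make the hint concrete, here is a minimal sketch in plain C (no MPI calls) of the memory layout that a vector with blocklength=1 describes. The function names are hypothetical and not part of OSHMPI. The resized extent used here, nelems * stride elements, is an assumption for illustration only: the issue says the vector is resized but does not state the target extent.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of the i-th selected element. With blocklength = 1, each block
 * holds one element and consecutive blocks start `stride` elements apart. */
size_t strided_offset_bytes(size_t i, size_t stride, size_t elem_bytes)
{
    return i * stride * elem_bytes;
}

/* Natural extent of the vector type: it ends right after the last element,
 * so the gap that would follow the last block is not counted. */
size_t vector_extent_bytes(size_t nelems, size_t stride, size_t elem_bytes)
{
    assert(nelems >= 1 && stride >= 1);
    return ((nelems - 1) * stride + 1) * elem_bytes;
}

/* ASSUMPTION (not taken from the issue): the type is resized so that its
 * extent also covers the gap after the last block, i.e. nelems * stride
 * elements. The actual extent chosen by OSHMPI may differ. */
size_t resized_extent_bytes(size_t nelems, size_t stride, size_t elem_bytes)
{
    assert(nelems >= 1 && stride >= 1);
    return nelems * stride * elem_bytes;
}
```

For example, 4 elements of 8 bytes with stride 5 sit at byte offsets 0, 40, 80 and 120; the natural vector extent is 128 bytes, and the assumed resized extent is 160 bytes.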

Evaluation platform: LCRC/Bebop Broadwell and KNL are preferred

Estimated effort: 3 days to 1 week

TODOs:

  • [ ] Overhead breakdown of strided datatype creation in OSHMPI/MPICH/Yaksa path on CPU
  • [ ] Overhead breakdown of strided datatype creation in OSHMPI/MPICH/dataloop path on CPU (optional)
  • [ ] Overhead analysis of strided datatype decoding in PUT in the OSHMPI/MPICH path (assuming Yaksa and dataloop behave the same) on CPU
  • [ ] Overhead breakdown of strided datatype creation in OSHMPI/MPICH/Yaksa path on KNL
  • [ ] Overhead breakdown of strided datatype creation in OSHMPI/MPICH/dataloop path on KNL (optional)
  • [ ] Overhead analysis of strided datatype decoding in PUT in the OSHMPI/MPICH path (assuming Yaksa and dataloop behave the same) on KNL
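One possible way to collect the per-step breakdown asked for in the TODOs is a small harness that times each step separately and accumulates the totals. The sketch below is an assumption about how such a measurement could be organized, not code from OSHMPI or MPICH. All names (phase_t, run_breakdown, the demo_* steps) are hypothetical, and the demo steps are stand-ins that only count calls. In a real run, they would wrap MPI_Type_vector plus MPI_Type_create_resized, MPI_Type_commit, and MPI_Type_free, and MPI_Wtime could replace the C11 timer used here.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* One timed step; ctx carries whatever state the steps share. */
typedef void (*phase_fn)(void *ctx);

typedef struct {
    const char *name;   /* label used when reporting */
    phase_fn fn;        /* step to time */
    double total_sec;   /* accumulated wall-clock time */
    size_t calls;       /* number of timed invocations */
} phase_t;

/* Wall-clock time in seconds, using C11 timespec_get. */
static double now_sec(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double) ts.tv_sec + (double) ts.tv_nsec * 1e-9;
}

/* Run every phase in order, `iters` times, timing each call on its own.
 * Phases run in sequence because later steps depend on earlier ones. */
void run_breakdown(phase_t *phases, size_t nphases, size_t iters, void *ctx)
{
    assert(phases != NULL && nphases > 0);
    for (size_t it = 0; it < iters; it++) {
        for (size_t p = 0; p < nphases; p++) {
            double t0 = now_sec();
            phases[p].fn(ctx);
            phases[p].total_sec += now_sec() - t0;
            phases[p].calls++;
        }
    }
}

/* Average time per call in microseconds; 0 if the phase never ran. */
double phase_avg_usec(const phase_t *p)
{
    return p->calls ? p->total_sec * 1e6 / (double) p->calls : 0.0;
}

/* HYPOTHETICAL stand-in steps: they only count calls. A real measurement
 * would place the corresponding MPI datatype calls here. */
typedef struct {
    size_t created, committed, freed;
} demo_ctx_t;

void demo_create(void *ctx) { ((demo_ctx_t *) ctx)->created++; }
void demo_commit(void *ctx) { ((demo_ctx_t *) ctx)->committed++; }
void demo_free(void *ctx)   { ((demo_ctx_t *) ctx)->freed++; }
```

A caller would fill an array of phase_t entries (for instance "create", "commit", "free"), call run_breakdown with an iteration count, and report phase_avg_usec for each entry. Because each call is timed individually, the timer's own cost is included in every sample; for very short steps it may be comparable to the step itself, so it is worth measuring an empty phase as a baseline.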

minsii, May 14 '20 03:05