Paul Coffman
I think the reality is that this messaging performance is what it is, and it should really be up to MPI tests like the OSU microbenchmarks to identify bottlenecks in the...
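For reference, a minimal sketch of running the OSU point-to-point microbenchmarks to isolate messaging behavior at different ppn counts. The binary paths and the Hydra-style `-ppn` flag are assumptions; adjust for the local MPI stack and the layout of the installed benchmark tree:

```shell
# Two-rank latency test (assumes the OSU micro-benchmarks are built
# and osu_latency is on the current path)
mpirun -np 2 ./osu_latency

# Multi-pair bandwidth / message-rate test packed onto one node to
# mimic high-ppn pressure (osu_mbw_mr pairs up all ranks)
mpirun -np 96 -ppn 96 ./osu_mbw_mr
```

Comparing the per-pair message rate at low vs. high ppn would show whether the slowdown is in the messaging layer itself rather than in the MPI-IO aggregation logic.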
This issue can be closed.
I think you covered all the angles, Rob.
Thanks for the suggestion @wkliao. For the LAMMPS PnetCDF case we are only running 12 ppn; the data is 3D-decomposition rank-ordered, so we would get some benefit from the...
@roblatham00 has been investigating MPI-IO collective aggregation performance, as have I; this is probably related. Is this only for high ppn, or are you seeing this at, say...
I took a look. Taking the 4 KB message size with the progress throttle on 512 nodes, latency goes from 25 ms for 12 ppn to 327 ms for 96...
I have a version of IOR that, when running against DAOS in MPI-IO mode with collective buffering disabled, writes discontiguous data all over the file; essentially every rank writes...
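For anyone trying to reproduce the collective-buffering-disabled configuration outside the enhanced IOR, the standard ROMIO hints can be set on the file open. A minimal sketch (the file name is a placeholder, and this assumes a ROMIO-based MPI-IO layer where these hints apply):

```c
/* Sketch: open a file through MPI-IO with ROMIO two-phase collective
 * buffering disabled, so collective writes fall through to independent,
 * discontiguous I/O from every rank. Build with an MPI compiler (mpicc). */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    MPI_Info info;
    MPI_Info_create(&info);
    /* Standard ROMIO hints: turn off collective buffering */
    MPI_Info_set(info, "romio_cb_write", "disable");
    MPI_Info_set(info, "romio_cb_read", "disable");

    MPI_File fh;
    /* "testfile" is a placeholder path */
    MPI_File_open(MPI_COMM_WORLD, "testfile",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);

    /* ... with the hints above, a collective write such as
     * MPI_File_write_at_all() from each rank to a strided offset is
     * serviced as independent discontiguous writes ... */

    MPI_File_close(&fh);
    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}
```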
A couple more details at the messaging level on my enhanced IOR slowdown: we are using erasure encoding in DAOS with 128 KB cells, so in the regular contiguous block...
Actually, I was off by a factor of 2 on the enhanced IOR RMA gets: for 16 ppn the DAOS server does 4 32 KB RMA gets from 4 clients; for 64 ppn...
One more data point: with collective buffering ON for my enhanced IOR, on the read, where each collective buffer distributes data to all the ranks, there is a 100x slowdown...