mpich
bug: shared file pointer implementation should be made more robust
Originally by robl on 2014-12-19 16:02:18 -0600
Paul Coffman reported a race condition where rank 0 might open the shared file pointer file, update it, close it, and delete (unlink) it, all in the time that rank 1 is still trying to open the file.
The right way to fix this is to make the implementation of shared file pointers retry/reopen if the file does not exist.
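A rough sketch of what retry/reopen could look like, written against plain POSIX calls rather than the real ADIO layer. The helper name `open_shared_fp_retry` and the `max_tries` parameter are made up for illustration and are not ROMIO's actual code. It also assumes the caller treats a freshly created, empty file as shared offset 0.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not ROMIO's actual code): open the hidden
 * shared-file-pointer file, recreating it if another rank unlinked
 * it between our attempts. Returns a file descriptor, or -1. */
static int open_shared_fp_retry(const char *path, int max_tries)
{
    assert(path != NULL);

    for (int attempt = 0; attempt < max_tries; attempt++) {
        /* Common case: the file is already there. */
        int fd = open(path, O_RDWR);
        if (fd >= 0)
            return fd;
        if (errno != ENOENT)
            return -1;          /* a real error, not the race */

        /* The file is missing: try to recreate it. O_EXCL makes sure
         * only one process wins the creation. */
        fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
        if (fd >= 0)
            return fd;
        if (errno != EEXIST)
            return -1;          /* e.g. the directory does not exist */

        /* EEXIST: someone else created it between our two calls,
         * so loop around and open the existing file. */
    }
    return -1;                  /* kept losing the race; give up */
}
```

This only covers the open path. Whether the shared offset stays correct after a recreate depends on how the last closer decides it is safe to unlink, which the sketch does not address.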
The shortest fix is to add a barrier in MPI_FILE_CLOSE. Yuck.
Shouldn't MPI_File_open be a collective and thus okay for a barrier?