S3FS takes time to sync up

Open calebwin opened this issue 3 years ago • 0 comments

Sometimes S3FS takes time to sync up files or file metadata. On one occasion with the Black Scholes tests, HDF5.ishdf5 raised an error saying that the file we were trying to write to was invalid, because the file still existed according to S3FS. The file had recently been deleted by some code that ran on the client side.

On another occasion, loading a file that had definitely been written (by some code on the client side) failed with ERROR: LoadError: ArgumentError: "/home/ec2-user/s3fs/banyan-cluster-data-pumpkincluster02-f47c1c35/iris_large.csv" is not a valid file.

This issue could affect usage of S3FS on the client side or on the cluster. There are two action items:

  • [ ] Ensure that we are not configuring S3FS to allow caching.
  • [ ] Use fsync to sync up files that are written to.
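
A minimal sketch of the fsync action item, written in Python purely for illustration (the project itself is Julia, and the helper name write_and_sync is hypothetical). It flushes the user-space buffer and then calls fsync on the file descriptor before closing. Whether fsync on an S3FS mount makes the object visible to other readers right away is an assumption that would need to be checked against the mount options in use.

```python
import os


def write_and_sync(path: str, data: bytes) -> None:
    """Write data to path and ask the OS to push it to the underlying storage.

    Hypothetical helper: on an S3FS mount this only requests a sync;
    it does not by itself guarantee that other machines see the file.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # flush Python's user-space buffer to the OS
        os.fsync(f.fileno())  # ask the OS to flush the file to storage
```

A caller would use this in place of a plain write, for example write_and_sync(path, contents) followed by the read that previously failed. It only addresses the write side; stale metadata caused by caching is a separate concern covered by the first action item.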

calebwin, Aug 13 '21 18:08