ocfl-py

Support AWS S3 access

Open zimeon opened this issue 5 years ago • 3 comments

See the pyfilesystem2 branch for work to switch all file access over to PyFilesystem. This should enable the code to work with regular OS filesystems, S3, and zipped filesystems, among others.
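For illustration, a minimal sketch of what uniform file access through PyFilesystem looks like. It assumes the third-party fs package is installed (plus fs-s3fs for s3:// URLs); the helper name list_files and the example URLs are hypothetical and not taken from ocfl-py.

```python
def list_files(fs_url):
    """Return a sorted list of all file paths under a PyFilesystem URL."""
    # Imported lazily so the sketch can be loaded without PyFilesystem installed.
    from fs import open_fs

    # The same code path serves any backend that has a registered opener.
    with open_fs(fs_url) as filesystem:
        return sorted(filesystem.walk.files())
```

The same call would then take an OS path, something like zip://fixtures.zip, or s3://mybucket/root, with only the URL changing.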

zimeon, Jul 21 '20 02:07

Have a version running that passes all tests with regular OS filesystem access and with zipped sets of files (useful for test fixtures with empty directories, which git doesn't support).

Found a gotcha with S3 support via S3FS: it assumes that there are "directory objects" to help simulate a filesystem (noted in https://fs-s3fs.readthedocs.io/en/latest/#limitations). However, it would be good to be able to validate OCFL objects and storage roots on S3 that do not include "directory objects". A solution may be to create the S3FS object with strict=False, but that doesn't work with the standard open_fs(...) filesystem opener; see https://github.com/PyFilesystem/s3fs/issues/65#issuecomment-661573098
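To make the gotcha concrete, here is a small sketch in plain Python (no S3 access) of how directories can be inferred from key prefixes alone when a bucket holds no "directory objects". The function name implied_dirs and the sample keys are hypothetical; this shows the idea only and is not how S3FS or ocfl-py implement it.

```python
def implied_dirs(keys, delimiter="/"):
    """Return every directory path implied by the prefixes of flat object keys."""
    dirs = set()
    for key in keys:
        parts = key.split(delimiter)[:-1]  # drop the final (file name) component
        for depth in range(1, len(parts) + 1):
            dirs.add(delimiter.join(parts[:depth]))
    return sorted(dirs)


# Hypothetical keys for one OCFL object stored without any "directory objects"
keys = [
    "obj1/0=ocfl_object_1.0",
    "obj1/inventory.json",
    "obj1/v1/inventory.json",
    "obj1/v1/content/file.txt",
]
print(implied_dirs(keys))  # ['obj1', 'obj1/v1', 'obj1/v1/content']
```

None of obj1, obj1/v1 or obj1/v1/content exists as an object here, so code that insists on directory objects would not see them as directories.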

zimeon, Jul 21 '20 02:07

Maybe it is possible to pass strict=False into the generic opener by adding a query parameter strict=0, which would be parsed and then used in the S3FS version of open_fs: https://github.com/PyFilesystem/s3fs/blob/master/fs_s3fs/opener.py#L23-L27
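A rough sketch of the parsing side of that idea, using only the standard library. The helper name strict_from_url is hypothetical, and the real S3FS opener works from a URL that PyFilesystem has already parsed, so this shows the intended behaviour rather than the actual opener code.

```python
from urllib.parse import parse_qs, urlsplit


def strict_from_url(fs_url, default=True):
    """Return False only when the URL carries strict=0; otherwise the default."""
    params = parse_qs(urlsplit(fs_url).query)
    values = params.get("strict")
    if not values:
        return default  # parameter absent: keep the strict default
    return values[-1] != "0"


print(strict_from_url("s3://mybucket/root?strict=0"))  # False
print(strict_from_url("s3://mybucket/root"))           # True
```

The resulting boolean would then be passed as the strict argument when the S3FS object is constructed.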

zimeon, Jul 23 '20 20:07

Have merged in the pyfilesystem2 branch as version 1.1.0. It needs some more work to tidy the pyfs code, especially the new version of walk that avoids using scandir in ocfl/pyfs.

zimeon, Aug 03 '20 15:08