lemur
S3 data mover to support checksum
I understand the S3 data mover does not currently support checksums for put/get operations. Is this on the roadmap?
Currently (or at least as of when I originally wrote this a couple of years ago), checksums are calculated while the file is being read or written. This works because the local mover copies the file sequentially, so the checksum is basically "free." The S3 library, however, copies files in chunks and not necessarily in order, so it would be difficult to calculate a checksum as the file is being copied. At the time I didn't see any way of doing it other than scanning the file again before or after the operation, and I didn't get a chance to add that.