s3fs

S3 consistency guarantees

Open jseabold opened this issue 9 years ago • 8 comments

I was perusing the code a little more, and I wanted to bring this up, because people will run into it. In the S3Map class, you provide a __setitem__. S3 only has eventual consistency guarantees on overwrites. Anecdotally, I've heard that this can be up to 48 hours. If people start to try to use this as a true, consistent key-value store in a production setting, they're going to have a bad time. I suppose removing this is going to be out of the question, so I would at least provide a very visible warning somewhere.

Q: What data consistency model does Amazon S3 employ?

Amazon S3 buckets in all Regions provide read-after-write consistency for PUTS of new objects and eventual consistency for overwrite PUTS and DELETES.

https://aws.amazon.com/s3/faqs/

jseabold avatar May 23 '16 14:05 jseabold

Thanks for the heads up.

martindurant avatar May 23 '16 17:05 martindurant

My personal use case (storing large ndarrays) is write-once-read-many, so this doesn't bother me-as-a-user. However, I recognize that this can introduce super-subtle and damaging bugs. In the short term I'm tempted to resolve it with loud warnings, both in documentation and in docstrings. I'm not sure how to handle this long-term.

mrocklin avatar May 25 '16 13:05 mrocklin

FWIW, when I test overwriting a mapping key, the update seems to show up immediately. Of course, S3 could be doing fancy caching for my user, or maybe the consistency guarantee refers to access from different regions.

martindurant avatar May 25 '16 14:05 martindurant

Yeah, performance is going to vary, and I don't think you can plan for it, but I would take them at their word. Regions may matter, but if you search for the issue you'll find that a lot of people discovered this the hard way, down the line, through very subtle bugs, particularly in the fake-a-filesystem-for-Hadoop use case.

My suggestion would be to avoid the temptation of Python dunder magics for this and to use verbs for things like put and post that have different code paths, so users are forced to confront it. Presenting it as a consistent key-value store like a dictionary is going to bite people.

— You are receiving this because you authored the thread. Reply to this email directly or view it on GitHub https://github.com/dask/s3fs/issues/49#issuecomment-221597217

jseabold avatar May 25 '16 15:05 jseabold
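The "verbs instead of dunder magics" suggestion above could look roughly like the sketch below. Everything in it is hypothetical: `ExplicitStore`, `put_new`, and `overwrite` are illustrative names, not part of s3fs, and a plain `dict` stands in for the S3 bucket so the example runs anywhere. Note that on real S3 the existence check would itself be subject to the same consistency caveats, so this only forces the caller to state intent; it does not make overwrites consistent.

```python
class ExplicitStore:
    """Hypothetical key-value wrapper that exposes explicit verbs
    instead of __setitem__. `backend` is any dict-like object;
    a plain dict stands in for S3 here (assumption)."""

    def __init__(self, backend=None):
        self._backend = {} if backend is None else backend

    def get(self, key):
        return self._backend[key]

    def put_new(self, key, value):
        """Create a key that must not exist yet."""
        if key in self._backend:
            raise KeyError(f"{key!r} already exists; call overwrite() explicitly")
        self._backend[key] = value

    def overwrite(self, key, value):
        """Replace an existing key. On S3, readers might still see
        the old value for some time after this returns."""
        if key not in self._backend:
            raise KeyError(f"{key!r} does not exist; call put_new() instead")
        self._backend[key] = value


store = ExplicitStore()
store.put_new("array/0", b"first")
store.overwrite("array/0", b"second")  # the caller has to opt in to overwriting
```

Because there is no `store[key] = value`, a write-once-read-many user only ever calls `put_new`, and anyone who overwrites has to do so by name.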

I suppose the lack of consistency only manifests itself in the face of a partition, given the CAP theorem. So you really can't plan for it, and you won't know unless you implement an external system that provides something approaching consistency on top of a super highly available service. That is a big, hard problem.

jseabold avatar May 25 '16 15:05 jseabold

Semi-relevant: https://blogs.aws.amazon.com/bigdata/post/Tx1WL4KR7SE37YY/Ensuring-Consistency-When-Using-Amazon-S3-and-Amazon-Elastic-MapReduce-for-ETL-W

Amazon uses DynamoDB as the external system, but it is itself eventually consistent, AFAIK :)

jseabold avatar May 25 '16 18:05 jseabold

https://github.com/Netflix/s3mper

jseabold avatar May 25 '16 18:05 jseabold

Maybe there could be an option to raise an exception if one tries to set a key that already exists? If there were such an option, having it on by default would be good, I'd think, with a warning in the docstring.

mheilman avatar Jun 14 '16 21:06 mheilman
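A minimal sketch of the option proposed above, assuming a mapping that refuses to overwrite by default. `WriteOnceMap` and its `allow_overwrite` flag are hypothetical names, not the actual S3Map API, and a plain `dict` stands in for the bucket. On real S3 the `key in backend` check would be a separate request and could race with other writers, so this catches accidental overwrites rather than guaranteeing anything.

```python
from collections.abc import MutableMapping


class WriteOnceMap(MutableMapping):
    """Hypothetical dict-like store that raises on overwrite unless
    allow_overwrite=True. The backend is a plain dict (assumption)."""

    def __init__(self, backend=None, allow_overwrite=False):
        self._backend = {} if backend is None else backend
        self.allow_overwrite = allow_overwrite

    def __getitem__(self, key):
        return self._backend[key]

    def __setitem__(self, key, value):
        # Refuse to replace an existing key unless the user opted in.
        if not self.allow_overwrite and key in self._backend:
            raise ValueError(
                f"{key!r} already exists; overwrites are only eventually "
                "consistent on S3. Pass allow_overwrite=True to permit them."
            )
        self._backend[key] = value

    def __delitem__(self, key):
        del self._backend[key]

    def __iter__(self):
        return iter(self._backend)

    def __len__(self):
        return len(self._backend)


m = WriteOnceMap()
m["chunk-0"] = b"data"        # first write succeeds
try:
    m["chunk-0"] = b"other"   # second write to the same key is rejected
except ValueError as exc:
    print(exc)
```

Defaulting `allow_overwrite` to `False` matches the write-once-read-many use case mentioned earlier in the thread and makes the risky path opt-in.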