
S3 bucket eventual consistency errors

ryanmmmmm opened this issue 9 years ago · 5 comments

So over the weekend S3 eventual consistency on list operations became really slow, and it seems the S3 list index was returning files that were actually deleted, therefore I was getting COPY errors.

So I think I'll need to run through the list and do a HEAD request for each file to make sure it's 100% there before adding it to the manifest, correct?

```sql
select recordtime, query, sliceid,
       substring(key from 1 for 20) as file,
       substring(error from 1 for 300) as error,
       bucket, key
from stl_s3client_error;
```

```
2015-04-27 20:12:18.131621 | 1498948 | 31 | staging/y=2015_m=4_d | s3-us-west-2.amazonaws.com 54.231.161.48 S3ServiceException:The specified key does not exist.,Status 404,Error NoSuchKey,Rid C61A58E48CCA899D,ExtRid p5mfdOb0otnryj7yxur/iM/TS9RI+DfjqA58WL5GSlMZhEJsPJENIvHDRUFfCEkEA7VkQ7frq9U=,CanRetry 1,Abort: 5 5 5. | datasnap-events-prod-flattened-redshift-04-13-2015 | staging/y=2015_m=4_d=27_H=18_M=15_80deb2ee-2bc9-4c44-9a4f-b5f4b1e07e87.gz
```

ryanmmmmm · Apr 27 '15 20:04

Amazon also recommends keeping an index of the current files in a DynamoDB table...

They said it was the basis for EMRFS, which lets MapReduce get around the eventual-consistency list problem.

ryanmmmmm · Apr 27 '15 20:04

Looking at these libraries now:

https://github.com/Netflix/s3mper http://techblog.netflix.com/2014/01/s3mper-consistency-in-cloud.html

I think the major thing with our use case causing these issues is that we are putting data into that bucket as we get it, not in a batch way where we copy the last hour's worth of data into that bucket for blueshift, which would give eventual consistency time to catch up.

ryanmmmmm · Apr 27 '15 20:04

Hi @ryanmedlin

Thanks for raising this - S3 is an interesting storage engine indeed ;-)

Correct me if I'm wrong, but what I understand is that you have failing operations caused by files being present while the manifest is generated but gone when the actual COPY statement is executed?

Since the old manifest file will be deleted and a new one will be generated before retrying, this means that (one of) the next loads will succeed once consistency catches up with the deleted TSV files.
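
To illustrate, here's a minimal Python sketch of that regenerate-and-retry flow. It is not blueshift's actual implementation: the run_copy callback and all the names are assumptions; only the manifest JSON format is real Redshift syntax.

```python
# Minimal sketch (assumed names; not blueshift's actual code) of the behaviour
# described above: every attempt rebuilds the manifest from a fresh LIST, so
# the load eventually succeeds once the list index catches up with the deletes.
import json
import time
import boto3

def load_with_retries(bucket, prefix, manifest_key, run_copy, max_attempts=5):
    s3 = boto3.client("s3")
    for attempt in range(max_attempts):
        listed = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
        entries = [{"url": "s3://%s/%s" % (bucket, obj["Key"]), "mandatory": True}
                   for obj in listed.get("Contents", [])]
        s3.put_object(Bucket=bucket, Key=manifest_key,
                      Body=json.dumps({"entries": entries}))
        try:
            run_copy("s3://%s/%s" % (bucket, manifest_key))  # caller-supplied COPY
            return
        except Exception:  # e.g. the NoSuchKey error from the log above
            time.sleep(30 * (attempt + 1))  # back off; let consistency catch up
    raise RuntimeError("load did not succeed in %d attempts" % max_attempts)
```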

My (current) opinion is that this is a simple way of handling the inconsistency error. An unacceptable error would be objects being listed that are not yet available as this could lead to loads never being able to complete. But the current error-mode guarantees that loads will eventually succeed which is a good thing. I think adding a dependency on a consistency system would add unneeded complexity to the system.

I am open to discussions :-) Can you confirm your load eventually succeeded?

tgk · Apr 28 '15 08:04

The problem that happened this weekend was this:

Blueshift successfully processed the files in S3 that were in the manifest and removed them.

Then more files were added to the S3 bucket. Blueshift then ran a list command to get the list of files, created a new manifest file, and tried to process it.

BUT this past weekend it seems like, for our region/buckets, the index used by the S3 LIST API was not being updated for longer than just a few minutes, so the list index still had the old files that were deleted.

Therefore the list command, and the subsequent manifest file, included files that were actually not in the bucket anymore, and hence the not-found error occurred.

To get around this you could do a LIST, then do a HEAD request for each file to check its existence, and only add it to the manifest file if it exists (see the sketch below). I might need to do this in a fork if we run into it again. Luckily, for the most part, S3 becomes consistent within seconds/minutes, so we don't usually see it.
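
Something like this in Python with boto3 (the function is a made-up sketch, not part of blueshift):

```python
# Sketch of the LIST + HEAD workaround (hypothetical helper, not blueshift's
# API): only keys that still answer a HEAD request make it into the manifest.
import boto3
from botocore.exceptions import ClientError

def verified_keys(bucket, prefix):
    s3 = boto3.client("s3")
    keys = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            try:
                s3.head_object(Bucket=bucket, Key=obj["Key"])  # existence check
                keys.append(obj["Key"])
            except ClientError as err:
                # A 404 here means the LIST index is stale: the object was
                # deleted but LIST is still returning it, so skip the key.
                if err.response["Error"]["Code"] not in ("404", "NoSuchKey"):
                    raise
    return keys
```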

Architecturally you can't ever depend on the S3 LIST at all, so I will need to keep my own index in DynamoDB of all puts and deletes, since DynamoDB reads can be made strongly consistent.
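
A rough sketch of what that index could look like (the table name, key schema, and helper names are all assumptions, not anything blueshift or EMRFS actually does):

```python
# Rough sketch of a DynamoDB-backed file index (table and attribute names are
# made up). Every S3 PUT/DELETE is mirrored into the table, and reads use
# ConsistentRead, so the index, unlike S3's LIST, is never stale.
import boto3

dynamodb = boto3.resource("dynamodb")
index = dynamodb.Table("s3-file-index")  # hypothetical table, hash key "s3key"

def record_put(key):
    index.put_item(Item={"s3key": key})

def record_delete(key):
    index.delete_item(Key={"s3key": key})

def current_keys():
    """Read the full key list back with strongly consistent reads."""
    keys, kwargs = [], {"ConsistentRead": True}
    while True:
        page = index.scan(**kwargs)
        keys.extend(item["s3key"] for item in page["Items"])
        if "LastEvaluatedKey" not in page:
            return keys
        kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]
```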

That s3mper library is a good reference for how they handle this, and so is EMRFS.

Some more informational links:

http://www.stackdriver.com/eventual-consistency-really-eventual/

Does the problem make sense now?

ryanmmmmm · Apr 29 '15 19:04

I was thinking I was going to add a config setting called s3_consistency :strict

so that I could turn that additional validation on or off depending on my use case. I might only turn it on if I get paged that we are getting COPY errors, to get things back to normal so data is loaded within minutes, not more than an hour, due to errors/eventual-consistency problems.
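
As a sketch, the toggle could gate the extra validation like this (hypothetical Python reusing the verified_keys sketch from above; blueshift's real config handling will differ):

```python
# Hypothetical sketch of the proposed toggle: "strict" pays for a HEAD per key
# to filter out stale LIST entries, anything else trusts the LIST as-is.
import boto3

config = {"s3_consistency": "strict"}  # the setting name proposed above

def keys_for_manifest(bucket, prefix):
    if config["s3_consistency"] == "strict":
        return verified_keys(bucket, prefix)  # LIST + HEAD sketch from earlier
    s3 = boto3.client("s3")
    resp = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    return [obj["Key"] for obj in resp.get("Contents", [])]
```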

ryanmmmmm · Apr 29 '15 19:04