
RDB Loader: explore expired token

Open chuwy opened this issue 7 years ago • 3 comments

Recently we had a loader job running for 9 hours (most likely because of ANALYZE). After all steps had completed successfully, RDB Loader tried to dump its log to S3 and failed with the following exception (from stderr):

RDB Loader successfully completed following steps: [Discover, Load, Analyze]
ERROR: Log-dumping failed: [com.amazonaws.services.s3.model.AmazonS3Exception: The provided token has expired. (Service: Amazon S3; Status Code: 400; Error Code: ExpiredToken; Request ID: E061D9A5AFF4238C), S3 Extended Request ID: VU58XYTJzR/L1+X/7VdQxkPF+AxcmwtDPoCAByhPu7tbny0sIUSZs2ZJaC6eAsapCVSZT+Kfh50=]

I think the problem is that we're using the same S3 client for Discovery and log dumping (the first and last workflow steps respectively), and this client is created during app initialization. After several hours the token expires and the original client fails.

I couldn't find the exact token lifetime for a client built with AmazonS3ClientBuilder.standard(), but I assume it is around 2 hours. Possible solutions:

  1. Create a separate client whenever log dumping happens
  2. Re-create the client only when this specific exception is thrown (leaning towards this one)
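
A rough sketch of what option 2 could look like, written in plain Java with no AWS dependency so it runs on its own. `RefreshingClient`, `factory` and `isExpired` are hypothetical names made up for illustration and are not part of RDB Loader or the AWS SDK. The idea is only this: keep one client, and if an operation fails in a way the predicate recognises as an expired token, build a new client and retry once.

```java
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not from RDB Loader): holds a client and retries an
 * operation once with a freshly built client when the failure looks like an
 * expired-credentials error.
 */
final class RefreshingClient<C> {
    // Builds a fresh client, which is assumed to also fetch fresh credentials
    private final Supplier<C> factory;
    // Decides whether a failure justifies rebuilding the client
    private final Predicate<RuntimeException> isExpired;
    private C client;

    RefreshingClient(Supplier<C> factory, Predicate<RuntimeException> isExpired) {
        this.factory = factory;
        this.isExpired = isExpired;
        this.client = factory.get();
    }

    /** Runs the operation; on an "expired" failure rebuilds the client and retries exactly once. */
    synchronized <R> R call(Function<C, R> operation) {
        try {
            return operation.apply(client);
        } catch (RuntimeException e) {
            if (!isExpired.test(e)) {
                throw e; // unrelated failure: do not hide it behind a retry
            }
            client = factory.get();         // re-create the client only now
            return operation.apply(client); // a second failure propagates to the caller
        }
    }
}
```

With the AWS SDK for Java v1, the factory would presumably be `() -> AmazonS3ClientBuilder.standard().build()`, and the predicate could check that the exception is an `AmazonServiceException` whose `getErrorCode()` equals `"ExpiredToken"`, matching the error code in the log above. Whether a freshly built client actually picks up new credentials depends on the credentials provider in use, so that part is an assumption.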

chuwy avatar Feb 05 '18 07:02 chuwy

Ah good sleuthing. Why not just re-initialize the client between each long-running task?

alexanderdean avatar Feb 05 '18 09:02 alexanderdean

Yeah, that's basically the first option (as Discovery and "LogDump" are the only two steps using S3). My reasoning for preferring the second option is that getting credentials is a sliiiiightly more failure-prone operation and I'd like to avoid it when it's not necessary. But in the end the difference between the implementations is very subtle, so I don't think it really matters. Especially since with the new manifests fewer jobs will take this long and the whole discovery step is moved to DynamoDB.

chuwy avatar Feb 05 '18 09:02 chuwy

Okay cool - up to you @chuwy

alexanderdean avatar Feb 05 '18 09:02 alexanderdean