RDB Loader: make consistency check folder-based

Open chuwy opened this issue 7 years ago • 1 comments

Currently we check consistency by comparing the list of files between checks, but this is (probably) too strict a check, because in the end we load data using a pattern such as s3://shredded/good/com.acme/shredded-context/jsonschema/1-0-0/part-*. That means we don't care whether particular files are available; we only need to know that the folders still exist.
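
To illustrate the difference between the two checks, here is a minimal sketch in Python. It is not the RDB Loader's actual code: the function names (`list_folders`, `files_consistent`, `folders_consistent`) and the `part-0000N` file names are hypothetical, and the two listings stand in for two consecutive S3 listings taken by the consistency check.

```python
import posixpath


def list_folders(keys):
    """Collapse object keys into the set of their parent folders."""
    return {posixpath.dirname(key) for key in keys}


def files_consistent(first, second):
    """File-based check: both listings must contain exactly the same keys."""
    return set(first) == set(second)


def folders_consistent(first, second):
    """Folder-based check: both listings must contain the same folders."""
    return list_folders(first) == list_folders(second)


# Hypothetical listings: a new part file appears between the two checks.
prefix = "s3://shredded/good/com.acme/shredded-context/jsonschema/1-0-0"
first_listing = [f"{prefix}/part-00000", f"{prefix}/part-00001"]
second_listing = first_listing + [f"{prefix}/part-00002"]

print(files_consistent(first_listing, second_listing))    # file lists differ
print(folders_consistent(first_listing, second_listing))  # folder set is unchanged
```

In this sketch the file-based check fails while the folder-based check passes, so the folder-based variant accepts states that the file-based one rejects.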

Implement in conjunction with https://github.com/snowplow/snowplow-rdb-loader/issues/68

chuwy avatar Dec 19 '17 05:12 chuwy

[3:50 PM] Alexander Dean: On
[3:50 PM] Alexander Dean: https://github.com/snowplow/snowplow-rdb-loader/issues/74
[3:51 PM] Alexander Dean: I am not against this, but I read this as reducing the accuracy of the consistency check
[3:51 PM] Alexander Dean: whereas the description implies it makes no difference
[3:52 PM] Anton Parkhomenko: Hm, probably you're right. We really care about files for accuracy
[3:52 PM] Alexander Dean: Let's push it back

alexanderdean avatar Jan 08 '18 15:01 alexanderdean