Anton Parkhomenko
If a user saves the data at the root of the bucket, like:
```
s3://snowplow-bucket/run=2022-01-28-16-30-00/
```
Instead of:
```
s3://snowplow-bucket/shredded/run=2022-01-28-16-30-00/
```
Our algorithm fails to figure out that the bucket is processed...
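To illustrate the two layouts above, here is a minimal sketch of a parser that accepts a run folder both at the bucket root and under a prefix. It is not the loader's actual algorithm; the names `RUN_FOLDER` and `parse_run_folder` are hypothetical, and the timestamp format is assumed from the examples.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern (not from the loader): the prefix between the bucket
# and the run folder is optional, so a root-level run folder also matches.
RUN_FOLDER = re.compile(
    r"^s3://(?P<bucket>[^/]+)/"
    r"(?P<prefix>(?:[^/]+/)*)"
    r"run=(?P<run>\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2})/?$"
)


def parse_run_folder(uri: str) -> Optional[Tuple[str, str, str]]:
    """Return (bucket, prefix, run id), or None if the URI is not a run folder."""
    match = RUN_FOLDER.match(uri)
    if match is None:
        return None
    return match.group("bucket"), match.group("prefix"), match.group("run")
```

With this sketch both example paths resolve to the same bucket and run id; only the prefix differs (empty for the root-level layout, `shredded/` for the nested one).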
This is a follow-up for #608. In addition to just checking whether loading progresses through stages, we should also be able to leverage [`STV_LOAD_STATE`](https://docs.aws.amazon.com/redshift/latest/dg/r_STV_LOAD_STATE.html). The docs claim it's available to...
Currently we use our own [`Logging` algebra](https://github.com/snowplow/snowplow-rdb-loader/blob/a2bc768803115b15649f9518fc59d30d1b0f50a4/modules/loader/src/main/scala/com/snowplowanalytics/snowplow/rdbloader/dsl/Logging.scala#L22), which allows us to test logging and abstract it away; however, it's very inflexible.
1. It doesn't allow disabling logging for some...
In a case where a batch has two schemas that would get transformed into the same table (e.g. `some-schema` and `some_schema`), we produce a not-so-helpful error message: https://github.com/snowplow/snowplow-rdb-loader/blob/a2bc768803115b15649f9518fc59d30d1b0f50a4/modules/loader/src/main/scala/com/snowplowanalytics/snowplow/rdbloader/db/Migration.scala#L251 At the very least...
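As a rough illustration of what a more helpful message could look like, the sketch below groups schema names by the table they would map to and reports any group with more than one member. The normalization in `table_name` is a simplified assumption (lowercase, non-alphanumeric runs replaced by `_`), not the loader's real naming rule, and all function names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List


def table_name(schema_name: str) -> str:
    # Assumed, simplified normalization; the loader's real rule may differ.
    return re.sub(r"[^a-z0-9]+", "_", schema_name.lower())


def find_collisions(schema_names: Iterable[str]) -> Dict[str, List[str]]:
    """Map each table name to the distinct schemas that collide on it."""
    by_table: Dict[str, set] = defaultdict(set)
    for name in schema_names:
        by_table[table_name(name)].add(name)
    return {
        table: sorted(names)
        for table, names in by_table.items()
        if len(names) > 1
    }


def collision_message(collisions: Dict[str, List[str]]) -> str:
    """Render one line per colliding table, naming every offending schema."""
    lines = [
        f"Schemas {', '.join(repr(n) for n in names)} map to the same table {table!r}"
        for table, names in sorted(collisions.items())
    ]
    return "\n".join(lines)
```

Naming both the offending schemas and the target table in the message is the point of the sketch; where such a check would run in the migration code is left open.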
We have tens of occurrences of this error for different clients: ``` com.snowplowanalytics.snowplow.rdbloader.LoaderError$StorageTargetError: Database error: [Amazon](500310) Invalid operation: 1023 Details: Serializable isolation violation on table - 100167, transactions forming the...
We have a very common kind of error, caused by a schema mistake, that results in a std load error. What we usually do is either:
1. Notifying the owner, asking...
If Redshift is in a faulty state, it would make sense to stop receiving messages. Otherwise we need to resend many messages that were acked but failed to load.
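One way to picture this is a consumer loop that checks the target's health before taking each message and acknowledges a message only after it has loaded. This is a minimal in-memory sketch, not the loader's code: `consume`, `is_healthy`, and `load` are hypothetical names, and a `deque` stands in for the real queue.

```python
from collections import deque
from typing import Callable, Deque, List


def consume(
    queue: Deque[str],
    is_healthy: Callable[[], bool],
    load: Callable[[str], None],
) -> List[str]:
    """Load messages until the queue is empty or the target is unhealthy.

    Returns the messages that were loaded and acknowledged.
    """
    acked: List[str] = []
    while queue and is_healthy():
        message = queue[0]   # peek without removing
        load(message)        # if this raises, the message stays queued
        queue.popleft()      # acknowledge only after a successful load
        acked.append(message)
    return acked
```

Messages left in the queue when the health check fails were never acknowledged, so nothing has to be resent once the target recovers.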
As we have in Snowflake Loader (https://github.com/snowplow-incubator/snowplow-snowflake-loader/blob/master/loader/src/main/resources/sql/atomic-def.sql), but come up with a more discoverable location.
In #232 we moved entirely to SQS discovery, but left some functionality related to discovering data on S3, mostly in the `ShreddedType` module. I think that it will be helpful later...