snowplow-rdb-loader
Stores Snowplow enriched events in Redshift, Snowflake and Databricks
In most places, Snowplow apps let the AWS SDK figure out the region using the default provider chain, but the AWS streaming transformer uses Hadoop for writing to S3 (only for Parquet...
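For context, a minimal sketch of the two lookup paths, assuming AWS SDK v2 and Hadoop's S3A connector. `DefaultAwsRegionProviderChain` and the `fs.s3a.endpoint.region` key are real; the surrounding wiring is illustrative, not the transformer's actual code:

```scala
import software.amazon.awssdk.regions.providers.DefaultAwsRegionProviderChain
import org.apache.hadoop.conf.Configuration

object RegionResolution {
  // How most Snowplow apps resolve the region: the SDK's default chain
  // checks env vars, system properties, the shared config file, and
  // finally the instance metadata endpoint.
  def sdkRegion(): String =
    new DefaultAwsRegionProviderChain().getRegion.id

  // Per the report above, Hadoop's S3A filesystem does not consult that
  // chain; the region must be set explicitly on its Configuration.
  def hadoopS3aConf(region: String): Configuration = {
    val conf = new Configuration()
    conf.set("fs.s3a.endpoint.region", region) // S3A key, Hadoop 3.3+
    conf
  }
}
```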
The transformer client is not able to decode a server response containing multiple schema versions.

```json
{
  "schema": "iglu:com.snowplowanalytics.snowplow.badrows/loader_iglu_error/jsonschema/2-0-0",
  "data": {
    "processor": {
      "artifact": "snowplow-transformer-kinesis",
      "version": "5.4.0"
    },
    "failure": [
      {...
```
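For anyone reproducing this, a small sketch of decoding the bad row above with circe; the case classes are hypothetical stand-ins shaped after the quoted payload, not the loader's real model:

```scala
import io.circe.Decoder
import io.circe.generic.semiauto.deriveDecoder
import io.circe.parser.decode

final case class Processor(artifact: String, version: String)
final case class BadRowData(processor: Processor)
final case class BadRow(schema: String, data: BadRowData)

object BadRowCheck {
  implicit val processorDecoder: Decoder[Processor] = deriveDecoder
  implicit val dataDecoder: Decoder[BadRowData]     = deriveDecoder
  implicit val badRowDecoder: Decoder[BadRow]       = deriveDecoder

  // Truncated to the fields quoted in the report
  val payload: String =
    """{"schema":"iglu:com.snowplowanalytics.snowplow.badrows/loader_iglu_error/jsonschema/2-0-0",
      |"data":{"processor":{"artifact":"snowplow-transformer-kinesis","version":"5.4.0"}}}""".stripMargin

  val parsed: Either[io.circe.Error, BadRow] = decode[BadRow](payload)
}
```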
Currently, for Redshift and Snowflake, the `domain_sessionid` column is loaded as a `char(128)` while most other columns are `varchar(128)`. In Snowflake this doesn't actually matter, as Snowflake does not...
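The distinction does matter in Redshift, where `CHAR` columns only accept single-byte characters and are blank-padded, while `VARCHAR` accepts multibyte UTF-8. As a sketch, a query one might run against Redshift to spot the affected column (schema/table names taken from the standard `atomic.events` layout, so verify against your deployment):

```scala
// Redshift's pg_table_def view reports char(128) as character(128).
val findCharColumns: String =
  """SELECT "column", type
    |FROM pg_table_def
    |WHERE schemaname = 'atomic'
    |  AND tablename  = 'events'
    |  AND type LIKE 'character(%';""".stripMargin
```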
I just upgraded transformer-kinesis and databricks-loader from version 5.3.0 to version 5.7.0. Before and after the upgrade I ran a stress test using Taurus. After both test...
This PR contains automated tests for Snowflake Loader on Azure. It brings the necessary building blocks to add tests for other destinations and cloud types as well. Test class structures are...
We encountered a case where batch transformer 5.4.1 didn't write the shredding-complete.json file, yet didn't log any warning/error/alert.
Logging SQL statements, along with some other metadata, at debug level could help us understand production behavior better. A possible alternative is [p6spy](https://github.com/p6spy/p6spy), a plug-and-play solution. An example...
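A sketch of what the p6spy route could look like, assuming a plain JDBC connection; the host, credentials, and real driver class here are placeholders:

```scala
import java.sql.{Connection, DriverManager}

object P6SpyExample {
  def connect(): Connection = {
    // p6spy proxies the real driver; the only code change is the URL prefix.
    Class.forName("com.p6spy.engine.spy.P6SpyDriver")
    DriverManager.getConnection(
      "jdbc:p6spy:redshift://example-host:5439/snowplow", // was jdbc:redshift://...
      "loader_user",
      "secret"
    )
  }
}

// src/main/resources/spy.properties (read by p6spy at startup):
//   driverlist=com.amazon.redshift.jdbc42.Driver
//   appender=com.p6spy.engine.spy.appender.Slf4JLogger
```

With the SLF4J appender configured, every statement p6spy intercepts is emitted through the application's existing logging setup, so no extra log plumbing is needed.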
If we have a schema with a property that is an enum, e.g.:

```
"name": {
  "type": "string",
  "enum": ["abc", "def", "ghi", "jkl"],
  "maxLength": 256
}
```

currently...
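One way to read the (truncated) report: the enum values themselves bound the longest possible value, so the column width could be derived from them rather than from `maxLength`. A hypothetical sketch, not the transformer's actual logic:

```scala
// Hypothetical helper: derive a column width from the enum values,
// falling back to maxLength when no enum is present.
def widthFromEnum(enumValues: List[String], maxLength: Int): Int =
  enumValues.map(_.length).maxOption.getOrElse(maxLength)

// For the schema above this would suggest VARCHAR(3) instead of VARCHAR(256)
val width = widthFromEnum(List("abc", "def", "ghi", "jkl"), 256) // 3
```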
This issue is about schema evolutions that add new columns. A problem arises when the data is transformed using the older schema but the load is attempted using...
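A toy illustration of the mismatch, with hypothetical column names: the batch on disk was written against the older schema, while the target table has already been migrated to the newer one:

```scala
// Hypothetical columns; not real loader code.
val columnsInBatch = List("event_id", "user_id")              // transformed with 1-0-0
val columnsInTable = List("event_id", "user_id", "added_col") // table already on 1-0-1

// Loading the two-column batch while naming all three table columns
// fails; one fix is to restrict the column list to what the batch has:
val loadableColumns = columnsInTable.filter(columnsInBatch.toSet)
```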