pramen issues

Add the ability to have metastore partitioning column as a string, not a date

## Background Currently, information date format and type are ignored when the metastore persistence format is 'delta'. For example, here the date format will be ignored: ```hocon pramen.metastore.tables = [...

yruslan

enhancement

Pramen-Scala

DS

Do not run `count()` before running a transformation

## Background Running `count()` on a big transformation is too expensive. It can be avoided by always executing the transformation and then reading the record count from the metastore....

yruslan

enhancement

Pramen-Scala

DS

#374 Incremental Ingestion

2

Closes #374 Closes #421 This PR adds 'incremental' as a schedule type, and mechanisms for managing offsets (experimental). Pramen `version 1.10` introduces the concept of incremental ingestion. It allows running...

yruslan

Allow metastore tables having `delta` format not be partitioned

## Background Partitioning of Delta Lake tables might actually worsen the efficiency of reads, especially for small tables. https://delta.io/blog/pros-cons-hive-style-partionining/ https://delta.io/blog/2023-06-03-delta-lake-z-order/ This feature is about adding a flag to make metastore...

yruslan

enhancement

Do not calculate record count for non-cached transient jobs

## Background Calculating record count for non-cached transient jobs effectively doubles the calculation time. ## Feature Do not calculate record count for non-cached transient jobs. ## Example [Optional] A simple...

yruslan

enhancement

Make pipeline statuses more strict in notifications

## Background Currently, there are 4 pipeline notification statuses: 1. Failed (no successful tasks or a fatal error) 2. Partial Success (some tasks succeeded, some failed) 3. Succeeded with warnings...

yruslan

enhancement

Add a way to route different issues to different email lists

## Background Currently, Pramen supports only 2 email lists - one for successes, and one for failures. Sometimes, depending on an error, for example, emails can be routed to different...

yruslan

enhancement

Pramen-Scala

DE
