pramen Use more effective record count when JDBC source is used with an SQL query

Open yruslan opened this issue 1 year ago • 0 comments

Background

This idea was reported by @filiphornak.

Currently, when an SQL expression (rather than a table name) is used as the input to the JDBC source, the record count is calculated this way: https://github.com/AbsaOSS/pramen/blob/0f8040a8bad151eccf5a6ee3403b2ae9c6a24b9e/pramen/core/src/main/scala/za/co/absa/pramen/core/reader/TableReaderJdbc.scala#L129-L129

This is not always efficient, since Spark cannot always obtain the record count without fetching all records.

Feature

Use more effective record count when JDBC source is used with an SQL query.

Proposed Solution

SELECT COUNT(*) AS CNT FROM (${query})
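
A minimal sketch of how the wrapping could be done, so that the database computes the count instead of Spark fetching rows. The object and method names (`CountQuery`, `getCountSql`) are hypothetical and not part of Pramen. Two details are assumptions beyond the proposal above: a trailing semicolon is stripped because it is invalid inside a subquery, and a derived-table alias (`q`) is appended because several databases reject an unaliased subquery in `FROM`. The alias is written without `AS`, since Oracle does not accept `AS` for table aliases.

```scala
object CountQuery {
  /** Wraps an arbitrary SELECT statement into a COUNT(*) query.
    * Hypothetical helper for illustration only, not Pramen's actual API.
    */
  def getCountSql(query: String): String = {
    // Remove surrounding whitespace and a single trailing semicolon,
    // which would be a syntax error inside a subquery.
    val inner = query.trim.stripSuffix(";").trim

    // Assumption: a derived-table alias is added for dialects that require one.
    s"SELECT COUNT(*) AS CNT FROM ($inner) q"
  }

  def main(args: Array[String]): Unit = {
    // Prints: SELECT COUNT(*) AS CNT FROM (SELECT id FROM users WHERE active = 1) q
    println(getCountSql("SELECT id FROM users WHERE active = 1;"))
  }
}
```

The resulting string could then be run through the JDBC source in place of the original query, reading the single `CNT` value from the one-row result. Whether the alias is needed, and whether it conflicts with any alias the reader adds on its own, depends on the database and the reader implementation and would need checking.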

Jan 17 '24 11:01 yruslan
