
Use a more efficient record count when a JDBC source is used with an SQL query

Open yruslan opened this issue 1 year ago • 0 comments

Background

This idea was reported by @filiphornak.

Currently, when an SQL expression (rather than a table name) is used as the input to the JDBC source, the record count is calculated this way: https://github.com/AbsaOSS/pramen/blob/0f8040a8bad151eccf5a6ee3403b2ae9c6a24b9e/pramen/core/src/main/scala/za/co/absa/pramen/core/reader/TableReaderJdbc.scala#L129-L129

This is not always efficient, because Spark cannot always determine the record count without fetching all records.

Feature

Use a more efficient record count when a JDBC source is used with an SQL query.

Proposed Solution

SELECT COUNT(*) AS CNT FROM (${query})
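
To illustrate the wrapping idea outside of Pramen, here is a minimal sketch in Python using the standard-library `sqlite3` module. It is not the Pramen implementation: the helper names `build_count_query` and `count_records`, the `events` table, and the subquery alias `q` are all hypothetical. The alias is an added assumption, since some databases (MySQL, for example) reject a derived table without one; stripping a trailing semicolon is likewise an assumption about what user-supplied queries may look like.

```python
import sqlite3


def build_count_query(query: str, alias: str = "q") -> str:
    """Wrap an arbitrary SELECT statement in a COUNT(*) subquery.

    The alias is an assumption: some databases require derived tables
    to be named. A trailing semicolon would break the subquery, so it
    is removed before wrapping.
    """
    inner = query.strip().rstrip(";").strip()
    return f"SELECT COUNT(*) AS CNT FROM ({inner}) {alias}"


def count_records(conn: sqlite3.Connection, query: str) -> int:
    """Let the database count the rows instead of fetching them all."""
    row = conn.execute(build_count_query(query)).fetchone()
    return int(row[0])


# Hypothetical in-memory table, used only to exercise the helpers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, info_date TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "2024-01-16"), (2, "2024-01-17"), (3, "2024-01-17")],
)

record_count = count_records(
    conn, "SELECT id FROM events WHERE info_date = '2024-01-17';"
)
print(record_count)
```

With this shape the database returns a single row containing the count, so the client never has to pull the full result set just to count it. Whether a given JDBC source accepts the wrapped query unchanged (alias syntax, `ORDER BY` inside a subquery, and so on) depends on the SQL dialect and would need checking per database.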

yruslan, Jan 17 '24 11:01