Add support for custom SQL flexibility in backfill mode
Based on the Sling documentation and GitHub discussions, I couldn't find a way to perform backfill replication using customSQL mode with the same flexibility as the {incremental_value} placeholder in incremental mode.
Source: slingdocs
From what I’ve seen, backfill mode only supports {incremental_where_cond}, which is automatically constructed as:
WHERE {update_key} >= {start_value} AND {update_key} <= {end_value}
Source: #41
Based on that statement, I tried to write the condition manually inside the custom SQL query like this:
defaults:
object: '{ENV_SCHEMA}_dl.raw_{stream_table}'
mode: backfill
primary_key: [id]
update_key: created_time
source_options:
range: '2024-11-10, 2024-11-11'
env:
SLING_LOADED_AT_COLUMN: timestamp
streams:
schema.table:
update_key: modified_time
sql: |
SELECT
*
,COALESCE(last_info_time, created_time) AS modified_time
FROM {stream_schema}.{stream_table}
WHERE COALESCE(last_info_time, created_time) >= {start_value}
AND COALESCE(last_info_time, created_time) <= {end_value}
ORDER BY COALESCE(last_info_time, created_time)
But this didn’t work as expected.
Would it be possible to support this as a new feature? I believe it would be helpful to have more flexibility when using backfill mode, especially when dealing with multiple {update_key} values inside custom SQL.
Thanks! #215 #41
Hey @royandhika, have you tried a CTE?
with source as(
SELECT
*, COALESCE(last_info_time, created_time) AS modified_time
FROM {stream_schema}.{stream_table}
)
select * from source
WHERE {incremental_where_cond}
ORDER BY modified_time
Hey @royandhika, have you tried a CTE?
with source as( SELECT *, COALESCE(last_info_time, created_time) AS modified_time FROM {stream_schema}.{stream_table} ) select * from source WHERE {incremental_where_cond} ORDER BY modified_time
Wow, yeah, never really thought of that! I tried it and it actually works!
Still, I think having able to custom {start_value} and {end_value} doesn’t seem like a bad idea at all. Thanks for the CTE tip!
{start_value} and {end_value} implemented a while back. Closing