Inquiry About Truncate & Load Support in DLT-META Framework
Hi,
I have a use case where I need to perform truncate-and-load operations on some small data sources. I am aware that the apply_changes_from_snapshot API can be used to implement change data capture (CDC).
I know that in a standard DLT pipeline using the Classic Core edition, it is possible to achieve truncate and load functionality. However, I could not find any reference to this capability in the documentation or demos of the DLT-META framework.
Currently, I am using apply_changes_from_snapshot, but I am exploring a more cost-efficient approach. Specifically, if we have a use case that only requires truncate and load (without needing Classic Pro edition features), I would like guidance on how to implement this within the DLT-META framework.
Regards, Shishupal
@shishupalgeek
DLT relies entirely on Structured Streaming internally (as you can see in the readers implementation). Because of this, it doesn’t support the traditional overwrite behavior available in standard Spark batch APIs.
The only available approach to achieve similar functionality within DLT is to use the apply_changes_from_snapshot API with Change Data Capture (CDC). This effectively emulates a truncate-and-load pattern while remaining compatible with the streaming architecture DLT is built on.
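To illustrate why snapshot-based CDC ends up equivalent to truncate-and-load, here is a minimal plain-Python sketch of the idea. It is not DLT or DLT-META code and does not reflect how apply_changes_from_snapshot is implemented internally; the function name `apply_snapshot`, the dict-based table layout, and the sample rows are all hypothetical. It assumes SCD type 1 style behaviour (no history kept), where each new snapshot is diffed against the current target by key.

```python
from typing import Dict, Tuple

Row = Dict[str, object]


def apply_snapshot(
    target: Dict[object, Row], snapshot: Dict[object, Row]
) -> Tuple[Dict[object, Row], Dict[str, int]]:
    """Diff a full snapshot against the current target, keyed by primary key.

    Hypothetical helper: returns the new target state plus change counts.
    """
    # Keys present only in the snapshot are inserts.
    inserts = [k for k in snapshot if k not in target]
    # Keys present in both, with different row content, are updates.
    updates = [k for k in snapshot if k in target and snapshot[k] != target[k]]
    # Keys present only in the target are deletes.
    deletes = [k for k in target if k not in snapshot]

    new_target = dict(target)
    for k in inserts + updates:
        new_target[k] = snapshot[k]
    for k in deletes:
        del new_target[k]

    stats = {
        "inserted": len(inserts),
        "updated": len(updates),
        "deleted": len(deletes),
    }
    return new_target, stats


# Current target table and a newly arrived full snapshot (sample data).
target = {1: {"id": 1, "name": "a"}, 2: {"id": 2, "name": "b"}}
snapshot = {2: {"id": 2, "name": "B"}, 3: {"id": 3, "name": "c"}}

new_target, stats = apply_snapshot(target, snapshot)
```

After the call, `new_target` holds exactly the rows of the latest snapshot, which is the same end state a truncate-and-load would produce. The difference is that only the changed rows are written, and the change is expressed as inserts, updates, and deletes that a streaming target can accept.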