
feat/3155 history

Open ArneDePeuter opened this issue 3 months ago • 0 comments

Description

This PR introduces the history feature (see issue #3155), which allows transformers to access parent records directly when keep_history=True is set.
This removes the need to explicitly propagate parent fields through every child transformer, simplifying pipelines and reducing schema clutter.

Implementation

  • When a resource is defined with keep_history=True, a history dictionary is made available to all subsequent nodes.
  • Only nodes marked with keep_history=True contribute their data to this dictionary, ensuring that only the required information is retained.
  • If no node in the chain (including parents) uses the history feature, every node receives the same shared EMPTY_HISTORY object. This avoids additional allocations and keeps the overhead negligible, because only a single immutable reference is passed around.
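
The behavior described in the list above can be illustrated with a small standalone sketch. It does not use dlt and is not the PR's implementation: Node, run_pipeline, _extend and the sample data are hypothetical names chosen for illustration, and the read-only mapping used for EMPTY_HISTORY is an assumption about how an "immutable reference" might be modeled.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Any, Callable, Iterable, Iterator, Mapping, Optional, Sequence, Tuple

# One shared, read-only placeholder reused by every node that has no history.
EMPTY_HISTORY: Mapping[str, Any] = MappingProxyType({})


@dataclass
class Node:
    """Hypothetical pipeline node (illustration only, not a dlt class)."""
    name: str
    fn: Optional[Callable[[Any, Mapping[str, Any]], Iterable[Any]]] = None
    keep_history: bool = False


def _extend(history: Mapping[str, Any], name: str, item: Any) -> Mapping[str, Any]:
    # Build a new read-only mapping instead of mutating one that may be shared.
    return MappingProxyType({**history, name: item})


def _apply(item: Any, history: Mapping[str, Any],
           nodes: Sequence[Node]) -> Iterator[Tuple[Any, Mapping[str, Any]]]:
    if not nodes:
        yield item, history
        return
    node, rest = nodes[0], nodes[1:]
    for child in node.fn(item, history):
        # Only nodes marked keep_history=True add their record to the history.
        child_history = _extend(history, node.name, child) if node.keep_history else history
        yield from _apply(child, child_history, rest)


def run_pipeline(root: Node, root_items: Iterable[Any],
                 transformers: Sequence[Node]) -> Iterator[Tuple[Any, Mapping[str, Any]]]:
    for item in root_items:
        history = _extend(EMPTY_HISTORY, root.name, item) if root.keep_history else EMPTY_HISTORY
        yield from _apply(item, history, transformers)


def label_line(order: Any, history: Mapping[str, Any]) -> Iterator[dict]:
    # Reads the grandparent record directly; "orders" never forwarded the name.
    yield {"user": history["users"]["name"], "sku": order["sku"]}


users = Node("users", keep_history=True)
orders = Node("orders", fn=lambda user, history: user["orders"])
lines = Node("lines", fn=label_line)

data = [{"name": "ann", "orders": [{"sku": "a1"}, {"sku": "a2"}]}]
results = [item for item, _ in run_pipeline(users, data, [orders, lines])]
# results == [{"user": "ann", "sku": "a1"}, {"user": "ann", "sku": "a2"}]
```

In this sketch, a chain in which no node sets keep_history=True passes the very same EMPTY_HISTORY object to every call, so no per-item mapping is allocated.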

The design follows a similar approach to the existing meta feature for consistency.
Comprehensive tests have been added to validate the behavior of this new feature.

Tiny Additional Proposal

As a follow-up improvement, we could simplify function signatures by removing the need for explicit = None defaults.
For example, instead of writing:

def transformer(item, meta=None, history=None):
    ...

we could allow:

def transformer(item, meta, history):
    ...

This would make transformer definitions cleaner and reduce boilerplate in function signatures.
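
One way such a change could work is to inspect the transformer's signature and pass only the arguments it declares. The sketch below uses the standard-library inspect module; call_transformer is a hypothetical helper for illustration, not an existing dlt function, and it assumes meta and history are always passed by keyword.

```python
import inspect
from typing import Any, Callable


def call_transformer(fn: Callable[..., Any], item: Any,
                     meta: Any = None, history: Any = None) -> Any:
    """Call fn with item, adding meta/history only if fn can accept them."""
    params = inspect.signature(fn).parameters
    # A **kwargs parameter means fn can take any keyword argument.
    accepts_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    extras = {"meta": meta, "history": history}
    kwargs = {k: v for k, v in extras.items() if k in params or accepts_kwargs}
    return fn(item, **kwargs)


def only_item(item):
    return item


def no_defaults(item, meta, history):
    return (item, meta, history)


# Both styles work without "= None" in the transformer's own signature.
call_transformer(only_item, 1, meta="m", history={"k": 1})
call_transformer(no_defaults, 1, meta="m", history={"k": 1})
```

Because the caller always supplies a value for every declared name, the defaults in the transformer signature become unnecessary. Positional-only parameters named meta or history are not handled by this sketch.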

ArneDePeuter, Oct 01 '25 13:10