Handle type mismatch conflicts better with `dlt`
Right now, runs fail on a type mismatch. If the upstream API is flaky and its schema changes between runs, dlt should handle the change instead of breaking the pipeline.
Some extra thoughts here: https://github.com/opensource-observer/oso/issues/5186#issuecomment-3369178423
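One possible direction, sketched below under stated assumptions, is to coerce each row to the expected types before it reaches dlt, so a mismatched value becomes `None` rather than failing the run. The field names and the `EXPECTED_TYPES` mapping are hypothetical and are not taken from the Giveth schema; this is an illustration of the idea, not how the pipeline currently works.

```python
from typing import Any, Callable, Dict

# Hypothetical expected types per field. The real mapping would have to
# come from the asset's known schema.
EXPECTED_TYPES: Dict[str, Callable[[Any], Any]] = {
    "id": int,
    "name": str,
    "allocated_fund": float,
}


def coerce_row(row: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `row` with known fields coerced to their expected type.

    Values that cannot be coerced are replaced with None instead of raising.
    Fields without an expected type are passed through unchanged.
    """
    out = dict(row)
    for field, caster in EXPECTED_TYPES.items():
        value = out.get(field)
        if value is None:
            # Missing or null field: nothing to coerce.
            continue
        try:
            out[field] = caster(value)
        except (TypeError, ValueError):
            # Type mismatch that cannot be repaired: drop the value, keep the row.
            out[field] = None
    return out
```

If this approach were adopted, a function like `coerce_row` could probably be attached to a dlt resource as a per-item transform (for example with the resource's `add_map` method), but that wiring is an assumption and is not shown here.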
It seems the API schema changed significantly between runs: some fields no longer return any data.
2025-10-28 15:12:06 +0100 - dagster - INFO - __ASSET_JOB - 7155b5f1-dc6b-4d12-8156-905c70efcb94 - giveth__qf_rounds - GraphQLFactory: Completed fetching 17 total items across 1 successful pages
2025-10-28 15:12:07,418|[WARNING]|81169|8582907904|dlt|logger.py|wrapper:40|In schema `qf_rounds`: The following columns in table 'qf_rounds' did not receive any data during this load and therefore could not have their types inferred:
- giving_blocks_id
- change_id
- youtube
- co_ordinates
- project_qf_round_relations
- stripe_account_id
- donations
- reactions
- social_media
- anchor_contracts
- status_history
- project_verification_form
- featured_update
- project_future_power
- project_instant_power
- verification_form_status
- social_profiles
- project_estimated_matching_view
- project_url
- prev_status_id
- project_update
- project_updates
- admin_js_base_url
- reaction
- campaigns
- cause_projects
- deposit_tx_chain_id
- chain_id
Unless type hints are provided, these columns will not be materialized in the destination.
One way to provide type hints is to use the 'columns' argument in the '@dlt.resource' decorator. For example:
@dlt.resource(columns={'giving_blocks_id': {'data_type': 'text'}})
2025-10-28 15:12:07 +0100 - dagster.daemon.QueuedRunCoordinatorDaemon - INFO - 1 runs are currently in progress. Maximum is 1, won't launch more.
2025-10-28 15:12:07 +0100 - dagster - DEBUG - __ASSET_JOB - 7155b5f1-dc6b-4d12-8156-905c70efcb94 - 81169 - giveth__qf_rounds - STEP_OUTPUT - Yielded output "result" of type "Any". (Type check passed).
2025-10-28 15:12:07 +0100 - dagster - DEBUG - __ASSET_JOB - 7155b5f1-dc6b-4d12-8156-905c70efcb94 - 81169 - giveth__qf_rounds - ASSET_MATERIALIZATION - Materialized value giveth qf_rounds.
2025-10-28 15:12:07 +0100 - dagster - DEBUG - __ASSET_JOB - 7155b5f1-dc6b-4d12-8156-905c70efcb94 - 81169 - giveth__qf_rounds - STEP_SUCCESS - Finished execution of step "giveth__qf_rounds" in 9m20s.
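Following the hint in the dlt warning above, a minimal sketch of building the `columns` argument for the empty columns could look like this. The helper name `build_column_hints` is hypothetical, only a few of the listed columns are included, and `text` is an assumed type taken from the warning's own example; several of these fields (for instance `donations` or `reactions`) may well be nested and need a different type.

```python
from typing import Dict, Iterable

# A few of the columns named in the dlt warning above.
EMPTY_COLUMNS = ["giving_blocks_id", "change_id", "youtube", "project_url"]


def build_column_hints(
    names: Iterable[str], data_type: str = "text"
) -> Dict[str, Dict[str, object]]:
    """Build a dlt-style column hint mapping for columns that received no data.

    Every column gets the same assumed data type and is marked nullable,
    since no values arrived for it in this load.
    """
    return {name: {"data_type": data_type, "nullable": True} for name in names}


hints = build_column_hints(EMPTY_COLUMNS)
# The result has the shape the warning suggests, so it could be passed as:
#   @dlt.resource(columns=hints)
```

With hints like these, the columns would be materialized in the destination even when a load brings no data for them, which is the behavior the warning describes.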