data-infra

airflow: operator and dag/task to sync NTD data via DOT API

Open charlie-costanzo opened this issue 6 months ago • 0 comments

Description

As we look to ingest more NTD data tables and years, we need to modify the existing scrape_ntd.py to use the DOT API instead of scraping Excel files from URLs.

Epic: #3402, Encompassing ticket: #3401

This PR adds a modified version of the script as a new file (scrape_ntd_api.py) that ingests its endpoints from a CSV upload, in preparation for turning it into an Airflow operator.
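
For illustration only, here is a minimal sketch of the "endpoints from a CSV" idea. It is not the code in scrape_ntd_api.py. The CSV column names (name, year, resource_id), the base URL, and the URL pattern are all assumptions made up for this example; the real DOT API endpoint format is not stated in this description.

```python
import csv
import io
import json
import urllib.request
from dataclasses import dataclass
from typing import Any, List

# Hypothetical placeholder; the real DOT API base URL is not given here.
BASE_URL = "https://api.example.invalid/resource"


@dataclass(frozen=True)
class Endpoint:
    """One row of the endpoints CSV (column names are assumed)."""
    name: str
    year: str
    resource_id: str


def read_endpoints(csv_text: str) -> List[Endpoint]:
    """Parse CSV text with a header row into Endpoint records."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        Endpoint(
            name=row["name"].strip(),
            year=row["year"].strip(),
            resource_id=row["resource_id"].strip(),
        )
        for row in reader
    ]


def build_url(endpoint: Endpoint, base_url: str = BASE_URL) -> str:
    """Build the request URL for one endpoint (assumed URL pattern)."""
    return f"{base_url}/{endpoint.resource_id}.json"


def fetch_json(url: str, timeout: float = 30.0) -> Any:
    """Fetch and decode a JSON response from the given URL."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    sample = "name,year,resource_id\nexample_table,2022,abcd-1234\n"
    for endpoint in read_endpoints(sample):
        # Only print the URL; fetch_json(url) would perform the request.
        print(endpoint.name, endpoint.year, build_url(endpoint))
```

Keeping the endpoint list in a CSV means new tables or years can be added without changing the script, and the parsing and URL-building steps stay separate from the network call, which may make them easier to wrap in an Airflow operator later.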

Resolves #3414

Type of change

  • [x] New feature

How has this been tested?

local runs, test buckets

Post-merge follow-ups

  • [x] No action required

charlie-costanzo · Aug 06 '24 16:08