open-grid-emissions
Add `--clobber` option in data pipeline
Certain functions in `data_pipeline`, specifically those which download data from the internet or take a long time to run (like the gross-to-net generation calculations), are implemented so that they will not be re-run if the data already exists in the directory. However, in certain cases (new source data is released, or the GTN calculations need to be regenerated) the user currently has to delete these directories manually. Instead, we should include an option like `--clobber` in these functions which, if used, would overwrite the data even if it already exists. We could also make this a command line argument if it's a common use case.
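A rough sketch of what this could look like, not tied to the actual code in the repo. The names `download_if_needed`, `fetch`, and `build_parser` are made up for illustration; the idea is just a `clobber` keyword on the skip-if-exists check, plus an `argparse` flag that could be passed through to it.

```python
import argparse
from pathlib import Path


def download_if_needed(target: Path, fetch, clobber: bool = False) -> bool:
    """Write fetch() to target; return True if the file was (re)written.

    `fetch` is a hypothetical zero-argument callable returning bytes,
    standing in for whatever actually downloads or computes the data.
    """
    # Existing behaviour: skip the work when the file is already there.
    # With clobber=True the check is bypassed and the file is overwritten.
    if target.exists() and not clobber:
        return False
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(fetch())
    return True


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical command line entry point exposing the flag."""
    parser = argparse.ArgumentParser(description="Run the data pipeline")
    parser.add_argument(
        "--clobber",
        action="store_true",
        help="Overwrite downloaded or intermediate data even if it exists",
    )
    return parser
```

The parsed value (`args.clobber`) would then be forwarded to each function that currently skips work when its output exists, so the default behaviour stays unchanged unless the flag is given.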
It would also probably be a good idea to clobber results folders by default, so that outdated result files are not mistaken for the output of a new run.
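For the results folders, something along these lines might work. This is only a sketch: `prepare_results_dir` and its `keep_existing` parameter are hypothetical names, and it assumes the whole results directory can safely be deleted at the start of a run.

```python
import shutil
from pathlib import Path


def prepare_results_dir(results_dir: Path, keep_existing: bool = False) -> Path:
    """Return an existing results directory, emptied by default.

    Clearing is the default so that stale files from an earlier run
    cannot be mistaken for new output; pass keep_existing=True to opt out.
    """
    if results_dir.exists() and not keep_existing:
        shutil.rmtree(results_dir)  # removes the directory and everything in it
    results_dir.mkdir(parents=True, exist_ok=True)
    return results_dir
```

Since `shutil.rmtree` deletes everything under the path, this should only ever be pointed at a directory the pipeline itself owns.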