Streamline workflow

Open FabianHofmann opened this issue 3 months ago • 3 comments

Closes #1916. (supersedes)

Streamlined and restructured workflow

This PR refactors the workflow to streamline and unify the optimization approaches and configuration settings. Here are some highlights of the PR, since otherwise no one will have the motivation to look at such a large change. I suggest reading this description before going into the code, even though it is long.

Highlights

  • No more wildcards except {horizon}; everything else moved to explicit config entries
  • No more cryptic filenames like elec_s_37_lv1.25_3H_2030.nc
  • Processing of network follows the logic of base.nc -> clustered.nc -> composed_{horizon}.nc -> solved_{horizon}.nc
  • Results are unified and made independent of horizons (csvs/nodal_costs.csv, csvs/capacities.csv ...)
  • The compose rule is the entry point for green-field, brown-field, and perfect-foresight models, both sector-coupled and electricity-only
  • Electricity-only models now run with myopic and perfect foresight
  • make_summary and plot_summary now work for all combinations of models (elec/sector) and foresights

And now a more descriptive summary:

Overview

  • This PR restructures the Snakemake workflow of PyPSA-Eur into a leaner pipeline (base → simplified → clustered → composed → solved) and standardises filenames so scenario information now lives in configuration instead of wildcards.
  • All rules that previously emitted *_s_{clusters} or ..._{planning_horizons} artefacts have been renamed; the new naming scheme makes horizons explicit via {horizon} and removes cluster suffixes.
  • Post-processing rules were refactored to read the new solved network filenames and now operate uniformly across all foresight modes, so the map and summary targets remain identical for overnight, myopic, and perfect runs for both elec-only and sector-coupled.
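The stage order described above can be sketched as a small path helper. This is only an illustration of the naming scheme, not code from the PR: the function name stage_paths and the results_dir default (standing in for the RESULTS prefix) are assumptions, and any resources prefix is omitted just as it is in this description.

```python
from pathlib import Path


def stage_paths(horizon, results_dir="results"):
    """Return the network file of each pipeline stage, in workflow order.

    `results_dir` is a hypothetical stand-in for the RESULTS prefix.
    """
    results = Path(results_dir)
    return {
        # Horizon-independent stages: one file per run.
        "base": Path("networks/base.nc"),
        "simplified": Path("networks/simplified.nc"),
        "clustered": Path("networks/clustered.nc"),
        # Horizon-dependent stages: one file per planning horizon.
        "composed": Path(f"networks/composed_{horizon}.nc"),
        "solved": results / f"networks/solved_{horizon}.nc",
    }


if __name__ == "__main__":
    for stage, path in stage_paths(2030).items():
        print(f"{stage:>10}: {path.as_posix()}")
```

Only the last two stages depend on {horizon}; everything before composition is shared across horizons.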

Workflow Changes

  1. Stage progression – The workflow now moves strictly through networks/base.nc → networks/simplified.nc → networks/clustered.nc → networks/composed_{horizon}.nc → RESULTS/networks/solved_{horizon}.nc. Old intermediate targets such as networks/base_s_{clusters}_{opts}_{sector_opts}_{planning_horizons}.nc no longer exist.
  2. Unified composition – rules/compose.smk encapsulates what used to be multiple prepare_*, add_*, and brownfield rules; it assembles everything needed for a given horizon and handles the myopic/perfect brownfield inputs.
  3. Single solve rule – rules/solve.smk contains one solve_network rule; electricity-only and sector-coupled cases are distinguished by config, not by separate rule files (solve_* smks were deleted).
  4. Collection targets – rules/collect.smk now checks four milestones: clustered networks, composed networks, solved networks, and plotting; {horizon} replaces {planning_horizons} wildcards in file names.
  5. Scenario-aware config – navigate_config, get_full_config, and get_config (all in rules/common.smk) cache fully merged configs per wildcard set. Any new rules should obtain parameters through config_provider(...) or get_config(w) to stay scenario compatible.
  6. Warm-start logic – Myopic runs read the previous RESULTS/networks/solved_{prev}.nc, while perfect foresight reuses the prior networks/composed_{prev}.nc. Overnight runs must supply a single planning horizon.
  7. Post-process harmonization – rules/postprocess.smk now reads RESULTS/networks/solved_{horizon}.nc for all foresight modes, so every map (power_network_{horizon}.pdf, h2_network_{horizon}.pdf, {carrier}_balance_map_{horizon}.pdf, etc.) and CSV summary is generated with the same naming scheme regardless of foresight setting.
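The warm-start rule from item 6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the logic in rules/compose.smk: the function name warm_start_input, the results_dir default (standing in for RESULTS), the sorting of horizons, and returning None for the first horizon are all hypothetical choices.

```python
def warm_start_input(horizon, horizons, foresight, results_dir="results"):
    """Return the previous-horizon network to read, or None if there is none.

    Hypothetical helper: mirrors the description above, not the PR's code.
    """
    horizons = sorted(horizons)  # assumption: horizons are processed in ascending order
    if foresight == "overnight":
        if len(horizons) != 1:
            raise ValueError("overnight runs must supply a single planning horizon")
        return None
    idx = horizons.index(horizon)
    if idx == 0:
        return None  # the first horizon has no predecessor
    prev = horizons[idx - 1]
    if foresight == "myopic":
        # Myopic runs continue from the previous *solved* network.
        return f"{results_dir}/networks/solved_{prev}.nc"
    if foresight == "perfect":
        # Perfect foresight reuses the previous *composed* network.
        return f"networks/composed_{prev}.nc"
    raise ValueError(f"unknown foresight mode: {foresight!r}")


if __name__ == "__main__":
    for mode in ("myopic", "perfect"):
        print(mode, warm_start_input(2040, [2030, 2040, 2050], mode))
```

The difference between the two multi-horizon modes is which stage the previous horizon is taken from: solved for myopic, composed for perfect foresight.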

Stage Notes

  • Base & shapes – rules/build_electricity.smk writes regions_onshore_base.geojson/regions_offshore_base.geojson together with networks/base.nc. Administrative shapes and OSM/TYNDP inputs remain unchanged.
  • Simplified assets – simplify_network now emits networks/simplified.nc, regions_onshore_simplified.geojson, regions_offshore_simplified.geojson, and busmap_simplified.csv. build_electricity_demand_base consumes those files and produces electricity_demand_simplified.nc. process_cost_data reads per-horizon costs_{horizon}.csv.
  • Clustering – cluster_network takes networks/simplified.nc plus busmap_simplified.csv and emits networks/clustered.nc, regions_onshore.geojson, regions_offshore.geojson, busmap.csv, and linemap.csv. Cluster counts are configured, not embedded in filenames.
  • Compose assets – Most intermediate input files drop the _s_{clusters} suffix, e.g. population layouts collapse to pop_layout.csv and pop_layout_simplified.csv, gas locations become gas_input_locations.geojson/gas_input_locations_simplified.csv, and the powerplant list becomes powerplants.csv.
  • Composition – compose_network (one rule) flattens all previous add_existing_baseyear, add_brownfield, prepare_perfect_*, etc. It automatically wires the previous-horizon inputs depending on foresight and provides the final pre-solve network (networks/composed_{horizon}.nc).
  • Solving – solve_network reads networks/composed_{horizon}.nc, writes RESULTS/networks/solved_{horizon}.nc, and stores solver/memory/python logs under RESULTS/logs/solve_network/. Custom extra functionality and solver settings are driven purely by config.
  • Post-processing – Preview maps are maps/base_network.pdf and maps/clustered_network.pdf; solved outputs are RESULTS/maps/power_network_{horizon}.pdf, .../h2_network_{horizon}.pdf, .../ch4_network_{horizon}.pdf, and .../{carrier}_balance_map_{horizon}.pdf. make_summary and plot_summary operate on RESULTS/networks/solved_{horizon}.nc regardless of foresight.

Config Changes

  • Planning horizons are now top-level – Set planning_horizons directly under the root of your config (config/config.default.yaml:34-35). The old scenario.planning_horizons entry is ignored, and the workflow expects config["planning_horizons"] to exist even when scenarios are disabled. Configuration bundles such as config/test/config.scenarios.yaml:17-20 already follow this format.
  • Global scenario block removed – The legacy scenario: section (with clusters, opts, sector_opts, etc.) is no longer part of config.default.yaml. Scenario sweeps should now be described via run.scenarios plus the dedicated scenario YAML file; individual dimensions (e.g. clusters) are configured directly under their respective sections. If you keep a scenario block in a local config it will simply be ignored.
  • CO₂ budget fields were restructured – co2_budget now specifies an emissions_scope, a values interpretation flag, and nested upper/lower dictionaries with their own enable switches (config/config.default.yaml:90-113). Update custom configs accordingly if you previously listed plain year-value pairs.
  • Transmission capacity caps gained explicit “extension” keys – lines uses s_nom_max/s_nom_max_extension, and links uses p_nom_max/p_nom_max_extension (config/config.default.yaml:320-355). Rename any overrides that still refer to max_extension so the new limits are applied.
  • Selective carrier exclusion moved into configuration – electricity.exclude_carriers (config/config.default.yaml:115-162) lets you strip carriers during clustering; custom busmap logic should now read that list instead of hard-coding exclusions.
  • Existing capacity toggles are explicit – The brownfield logic key is existing_capacities.enabled (config/config.default.yaml:479-486). Set it to true when you want compose_network to import historical assets; otherwise the new workflow assumes a greenfield build.
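For local configs that still use the legacy layout, the first two bullets could be handled by a helper along these lines. This is a hedged sketch, not part of the PR: migrate_config and LEGACY_SCENARIO_KEYS are hypothetical names, the config is assumed to be an already-loaded dict, and only the planning_horizons move is automated. The co2_budget and transmission keys are left alone because their exact layout is not spelled out in this description.

```python
import copy

# Scenario dimensions named above as no longer read from the `scenario:` block.
LEGACY_SCENARIO_KEYS = ("clusters", "opts", "sector_opts")


def migrate_config(old):
    """Return (new_config, notes) without modifying `old`.

    Hypothetical helper: moves scenario.planning_horizons to the top level
    and reports legacy scenario keys that the new workflow ignores.
    """
    new = copy.deepcopy(old)
    notes = []
    # Dropping the block is optional (it would be ignored anyway); it is
    # removed here only to keep the migrated config tidy.
    scenario = new.pop("scenario", None)
    if scenario is None:
        return new, notes
    horizons = scenario.get("planning_horizons")
    if horizons is not None and "planning_horizons" not in new:
        new["planning_horizons"] = horizons
        notes.append("moved scenario.planning_horizons to top-level planning_horizons")
    for key in LEGACY_SCENARIO_KEYS:
        if key in scenario:
            notes.append(f"scenario.{key} is ignored; configure it in its own section")
    return new, notes


if __name__ == "__main__":
    legacy = {"foresight": "myopic",
              "scenario": {"clusters": [37], "planning_horizons": [2030, 2040]}}
    migrated, messages = migrate_config(legacy)
    print(migrated)
    print("\n".join(messages))
```

An existing top-level planning_horizons entry is never overwritten, so running the helper on an already migrated config only removes a leftover scenario block.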

Script Changes

  • scripts/compose_network.py combines function calls from multiple preparatory scripts (add_electricity.py, add_existing_baseyear.py, add_brownfield.py, prepare_network.py, prepare_sector_network.py, prepare_perfect_foresight.py). At a later stage a clearer packaging structure should be used here.
  • Preparatory scripts are no longer executed – their main sections were integrated into compose_network.py.
  • Solving entry points are harmonized – scripts/solve_network.py is the only solver script called from Snakemake; solve_operations_network.py (operations-only runs) is no longer referenced by the workflow.
  • Summary generation moved into scripts/make_summary.py – the script now loads all horizons via pypsa.NetworkCollection for overnight, myopic, and perfect runs, so the dedicated helpers scripts/make_summary_perfect.py and scripts/make_global_summary.py were deleted. Any custom tooling should invoke make_summary.py and read the unified CSV outputs.

File Name Mapping

| Old target | New target | Notes |
| --- | --- | --- |
| networks/base_s.nc | networks/simplified.nc | Paired with regions_onshore_simplified.geojson, regions_offshore_simplified.geojson, and busmap_simplified.csv. |
| regions_onshore_base_s_{clusters}.geojson / regions_offshore_base_s_{clusters}.geojson | regions_onshore.geojson / regions_offshore.geojson | Produced by cluster_network; no {clusters} wildcard in filename. |
| busmap_base_s_{clusters}.csv / linemap_base_s_{clusters}.csv | busmap.csv / linemap.csv | Same rule; clusters now implicit in config. |
| powerplants_s_{clusters}.csv | powerplants_s.csv | Generated after clustering; uses networks/clustered.nc. |
| electricity_demand_base_s.nc | electricity_demand_simplified.nc | Created from simplified assets. |
| availability_matrix_{clusters}_{technology}.nc | availability_matrix_{technology}.nc | Includes MD/UA variants. |
| profile_{clusters}_{technology}.nc | profile_{technology}.nc | regions_by_class_* follow the same pattern. |
| pop_layout_base_s_{clusters}.csv / pop_layout_base_s.csv | pop_layout.csv / pop_layout_simplified.csv | Solar rooftop potentials drop cluster suffixes too. |
| gas_input_locations_s_{clusters}.geojson | gas_input_locations.geojson | Simplified CSV also loses the suffix. |
| costs_{planning_horizons}.csv / _processed.csv | costs_{horizon}.csv / _processed.csv | collect.smk expands over {horizon}. |
| networks/base_s_{clusters}_{opts}_{sector_opts}_{planning_horizons}.nc (+ _brownfield*) | networks/composed_{horizon}.nc | Handles all foresight modes. |
| RESULTS/networks/base_s_{clusters}_{opts}_{sector_opts}_{planning_horizons}.nc | RESULTS/networks/solved_{horizon}.nc | Solver logs renamed accordingly. |
| maps/power-network.pdf / maps/power-network-s-{clusters}.pdf | maps/base_network.pdf / maps/clustered_network.pdf | Cluster preview plotting renamed. |
| RESULTS/maps/base_s_{clusters}_{opts}_{sector_opts}-costs-all_{planning_horizons}.pdf | RESULTS/maps/power_network_{horizon}.pdf | Same pattern for hydrogen, methane, and balance maps. |
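A few rows of the mapping above can be applied mechanically to rule definitions in a fork. The sketch below is an assumption-laden illustration, not a tool shipped with the PR: RENAMES covers only some rows, translate is a hypothetical name, and paths are assumed to appear in the text exactly as written in the table (for example with a literal RESULTS/ prefix).

```python
# Old filename template -> new filename template, copied from rows of the table.
# Not exhaustive; extend it with the rows relevant to your own rules.
RENAMES = {
    "RESULTS/networks/base_s_{clusters}_{opts}_{sector_opts}_{planning_horizons}.nc":
        "RESULTS/networks/solved_{horizon}.nc",
    "networks/base_s_{clusters}_{opts}_{sector_opts}_{planning_horizons}.nc":
        "networks/composed_{horizon}.nc",
    "networks/base_s.nc": "networks/simplified.nc",
    "busmap_base_s_{clusters}.csv": "busmap.csv",
    "linemap_base_s_{clusters}.csv": "linemap.csv",
    "pop_layout_base_s_{clusters}.csv": "pop_layout.csv",
    "pop_layout_base_s.csv": "pop_layout_simplified.csv",
    "costs_{planning_horizons}.csv": "costs_{horizon}.csv",
}


def translate(text):
    """Replace legacy filename templates in `text` with their new names."""
    # Longest templates first, so the RESULTS/... entry is handled before the
    # shorter networks/... entry that it contains as a substring.
    for old in sorted(RENAMES, key=len, reverse=True):
        text = text.replace(old, RENAMES[old])
    return text


if __name__ == "__main__":
    print(translate('input: "networks/base_s.nc", "costs_{planning_horizons}.csv"'))
```

Plain string replacement is used instead of regular expressions because the old names are wildcard templates, so the braces would otherwise need escaping.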

Migration Checklist

  1. Update rule dependencies – Replace any references to *_base_s* or *_base_s_{clusters}_* targets with the new filenames; consume networks/composed_{horizon}.nc/RESULTS/networks/solved_{horizon}.nc instead of the legacy base_s_* artefacts.
  2. Add custom inputs and outputs from the prepare_* and add_* Snakemake rules to the compose rule in compose.smk and update them to the new wildcard convention (keep only the {horizon} wildcard).
  3. Apply custom implementations from legacy scripts (add_*.py/prepare_*.py) to the corresponding sections in compose_network.py, and make sure to import the necessary functions from the legacy script. The sections in compose_network.py reference the old scripts to indicate clearly where the changes belong.
  4. Adjust custom scripts – Update plotting, analysis, or data-export scripts to read the {horizon}-based filenames and the new CSV outputs from make_summary.
  5. Warm-start expectations – Myopic extensions should request RESULTS/networks/solved_{prev}.nc; perfect foresight helpers should import the previous networks/composed_{prev}.nc. Do not craft filenames manually.
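To find the places in a fork that the checklist applies to, custom rule files and scripts can be scanned for legacy naming. This is a rough sketch under stated assumptions: find_legacy_references and the three patterns are hypothetical and derived only from the names mentioned in this description, so they may miss cases or flag harmless lines.

```python
import re

# Heuristic patterns for legacy naming; labels and regexes are illustrative.
LEGACY_PATTERNS = {
    "cluster suffix": re.compile(r"base_s|_s_\{clusters\}"),
    "old horizon wildcard": re.compile(r"\{planning_horizons\}"),
    "removed wildcard": re.compile(r"\{(?:clusters|opts|sector_opts)\}"),
}


def find_legacy_references(text):
    """Return (line_number, label) pairs for lines that use legacy naming."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in LEGACY_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits


if __name__ == "__main__":
    snippet = (
        "rule my_custom_rule:\n"
        '    input: "networks/base_s_{clusters}.nc"\n'
        '    output: "custom_{planning_horizons}.csv"\n'
        '    log: "logs/custom_{horizon}.log"\n'
    )
    for lineno, label in find_legacy_references(snippet):
        print(f"line {lineno}: {label}")
```

A line can be reported more than once when it matches several patterns, which is intentional: each label points at a different checklist item.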

Checklist

  • [ ] I tested my contribution locally and it works as intended.
  • [ ] Code and workflow changes are sufficiently documented.
  • [ ] Changed dependencies are added to envs/environment.yaml.
  • [ ] Changes in configuration options are added in config/config.default.yaml.
  • [ ] Changes in configuration options are documented in doc/configtables/*.csv.
  • [ ] Sources of newly added data are documented in doc/data_sources.rst.
  • [ ] A release note doc/release_notes.rst is added.

FabianHofmann avatar Sep 22 '25 09:09 FabianHofmann

Thanks for starting this. I have only had a short look so far, and I guess it is too early for reviewing. How do you think we can best support this? I have opinions about many of the "critical open questions".

coroa avatar Sep 22 '25 09:09 coroa

You are totally right, it is too early for reviewing. I guess at this stage it is best to think about whether I can proceed with this approach or whether there are potential red flags that speak against this strategy in the first place. But perhaps I should spend more time refining this and write a proper report for you guys on what the wider implications are.

FabianHofmann avatar Sep 22 '25 10:09 FabianHofmann

I have opinions about many of the "critical open questions".

I must admit that the section Critical Unresolved Questions in the streamline document might be outdated and partially resolved (and perhaps even incomplete), so don't give it too much attention. But I would still be happy to hear any high-level thoughts and critical things that I am potentially missing!

FabianHofmann avatar Sep 22 '25 10:09 FabianHofmann