Streamline workflow
Closes #1916. (supersedes)
Streamlined and restructured workflow
This PR refactors the workflow to streamline and unify optimization approaches and configuration settings. Here are some highlights of the PR; otherwise no one will have the motivation to look at this large change. I suggest reading this description before going into the code, even though it is long.
Highlights
- No more wildcards except for `{horizon}`; all others moved explicitly to config entries
- No more cryptic filenames like `elec_s_37_lv1.25_3H_2030.nc`
- Processing of the network follows the logic `base.nc` -> `clustered.nc` -> `composed_{horizon}.nc` -> `solved_{horizon}.nc`
- Results are unified and made independent of horizons (`csvs/nodal_costs.csv`, `csvs/capacities.csv`, ...)
- The `compose` rule is the entry point for green-field, brown-field, and perfect foresight models, for both sector-coupled and electricity-only.
- Electricity-only models now run with myopic and perfect foresight
- Make and plot summary now work for all combinations of models (elec/sector) and foresights

And now a more descriptive summary.
Overview
- The PR restructures the Snakemake workflow of PyPSA-Eur into a leaner pipeline (`base → simplified → clustered → composed → solved`) and standardises filenames so that scenario information now lives in configuration instead of wildcards.
- All rules that previously emitted `*_s_{clusters}` or `..._{planning_horizons}` artefacts have been renamed; the new naming scheme makes horizons explicit via `{horizon}` and removes cluster suffixes.
- Post-processing rules were refactored to read the new solved network filenames and now operate uniformly across all foresight modes, so the map and summary targets remain identical for overnight, myopic, and perfect runs, for both electricity-only and sector-coupled models.
Workflow Changes
- Stage progression – The workflow now moves strictly through `networks/base.nc` → `networks/simplified.nc` → `networks/clustered.nc` → `networks/composed_{horizon}.nc` → `RESULTS/networks/solved_{horizon}.nc`. Old intermediate targets such as `networks/base_s_{clusters}_{opts}_{sector_opts}_{planning_horizons}.nc` no longer exist.
- Unified composition – `rules/compose.smk` encapsulates what used to be multiple `prepare_*`, `add_*`, and brownfield rules; it assembles everything needed for a given `horizon` and handles the myopic/perfect brownfield inputs.
- Single solve rule – `rules/solve.smk` contains one `solve_network` rule; electricity-only and sector-coupled cases are distinguished by config, not by separate rule files (the `solve_*` smks were deleted).
- Collection targets – `rules/collect.smk` now checks four milestones: clustered networks, composed networks, solved networks, and plotting; `{horizon}` replaces the `{planning_horizons}` wildcard in file names.
- Scenario-aware config – `navigate_config`, `get_full_config`, and `get_config` (all in `rules/common.smk`) cache fully merged configs per wildcard set. Any new rules should obtain parameters through `config_provider(...)` or `get_config(w)` to stay scenario compatible.
- Warm-start logic – Myopic runs read the previous `RESULTS/networks/solved_{prev}.nc`, while perfect foresight reuses the prior `networks/composed_{prev}.nc`. Overnight runs must supply a single planning horizon.
- Post-process harmonization – `rules/postprocess.smk` now reads `RESULTS/networks/solved_{horizon}.nc` for all foresight modes, so every map (`power_network_{horizon}.pdf`, `h2_network_{horizon}.pdf`, `{carrier}_balance_map_{horizon}.pdf`, etc.) and CSV summary is generated with the same naming scheme regardless of foresight setting.
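The warm-start rule above can be illustrated with a small Python sketch. It is not the code in `rules/compose.smk`: the function name `warm_start_input` is hypothetical, `RESULTS` is used as a literal placeholder for the results directory, and the horizons are assumed to be sorted in ascending order.

```python
from __future__ import annotations


def warm_start_input(foresight: str, horizons: list[int], horizon: int) -> str | None:
    """Return the previous-horizon network that `horizon` depends on, if any.

    Hypothetical helper; assumes `horizons` is sorted in ascending order.
    """
    if horizon not in horizons:
        raise ValueError(f"{horizon} is not one of the planning horizons {horizons}")
    if foresight == "overnight":
        # Overnight runs must supply a single planning horizon.
        if len(horizons) != 1:
            raise ValueError("overnight runs need exactly one planning horizon")
        return None
    idx = horizons.index(horizon)
    if idx == 0:
        return None  # the first horizon has no predecessor
    prev = horizons[idx - 1]
    if foresight == "myopic":
        # Myopic runs continue from the previously solved network.
        return f"RESULTS/networks/solved_{prev}.nc"
    if foresight == "perfect":
        # Perfect foresight reuses the previously composed network.
        return f"networks/composed_{prev}.nc"
    raise ValueError(f"unknown foresight mode: {foresight!r}")
```

For example, `warm_start_input("myopic", [2030, 2040, 2050], 2040)` points at the solved 2030 network, while the same call with `"perfect"` points at the composed 2030 network.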
Stage Notes
- Base & shapes – `rules/build_electricity.smk` writes `regions_onshore_base.geojson` / `regions_offshore_base.geojson` together with `networks/base.nc`. Administrative shapes and OSM/TYNDP inputs remain unchanged.
- Simplified assets – `simplify_network` now emits `networks/simplified.nc`, `regions_onshore_simplified.geojson`, `regions_offshore_simplified.geojson`, and `busmap_simplified.csv`. `build_electricity_demand_base` consumes those files and produces `electricity_demand_simplified.nc`. `process_cost_data` reads per-horizon `costs_{horizon}.csv`.
- Clustering – `cluster_network` takes `networks/simplified.nc` plus `busmap_simplified.csv` and emits `networks/clustered.nc`, `regions_onshore.geojson`, `regions_offshore.geojson`, `busmap.csv`, and `linemap.csv`. Cluster counts are configured, not embedded in filenames.
- Compose assets – Most intermediate input files drop the `_s_{clusters}` suffix; e.g. population layouts collapse to `pop_layout.csv` and `pop_layout_simplified.csv`, gas locations become `gas_input_locations.geojson` / `gas_input_locations_simplified.csv`, the powerplant list becomes `powerplants.csv`, etc.
- Composition – `compose_network` (one rule) flattens all the previous `add_existing_baseyear`, `add_brownfield`, `prepare_perfect_*`, etc. rules. It automatically wires the previous-horizon inputs depending on foresight and provides the final pre-solve network (`networks/composed_{horizon}.nc`).
- Solving – `solve_network` reads `networks/composed_{horizon}.nc`, writes `RESULTS/networks/solved_{horizon}.nc`, and stores solver/memory/python logs under `RESULTS/logs/solve_network/`. Custom extra functionality and solver settings are driven purely by config.
- Post-processing – Preview maps are `maps/base_network.pdf` and `maps/clustered_network.pdf`; solved outputs are `RESULTS/maps/power_network_{horizon}.pdf`, `.../h2_network_{horizon}.pdf`, `.../ch4_network_{horizon}.pdf`, and `.../{carrier}_balance_map_{horizon}.pdf`. `make_summary` and `plot_summary` operate on `RESULTS/networks/solved_{horizon}.nc` regardless of foresight.
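To make the naming scheme concrete, here is a minimal Python sketch that lists the main artefacts per stage for a set of horizons. The helper `stage_targets` and its `results_dir` parameter are hypothetical, and resource directory prefixes are omitted, as in the notes above.

```python
def stage_targets(horizons, results_dir="RESULTS"):
    """List the main per-stage artefacts for the given planning horizons.

    Hypothetical helper; `results_dir` stands in for the RESULTS prefix.
    """
    # Horizon-independent stages come first and exist exactly once.
    targets = [
        "networks/base.nc",
        "networks/simplified.nc",
        "networks/clustered.nc",
    ]
    # Composed and solved networks, plus the power map, exist once per horizon.
    for horizon in horizons:
        targets.append(f"networks/composed_{horizon}.nc")
        targets.append(f"{results_dir}/networks/solved_{horizon}.nc")
        targets.append(f"{results_dir}/maps/power_network_{horizon}.pdf")
    return targets
```

Only `{horizon}` varies between the entries; cluster counts and option strings never appear in the paths.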
Config Changes
- Planning horizons are now top-level – Set `planning_horizons` directly under the root of your config (`config/config.default.yaml:34-35`). The old `scenario.planning_horizons` entry is ignored, and the workflow expects `config["planning_horizons"]` to exist even when scenarios are disabled. Configuration bundles such as `config/test/config.scenarios.yaml:17-20` already follow this format.
- Global `scenario` block removed – The legacy `scenario:` section (with `clusters`, `opts`, `sector_opts`, etc.) is no longer part of `config.default.yaml`. Scenario sweeps should now be described via `run.scenarios` plus the dedicated scenario YAML file; individual dimensions (e.g. clusters) are configured directly under their respective sections. If you keep a `scenario` block in a local config it will simply be ignored.
- CO₂ budget fields were restructured – `co2_budget` now specifies an `emissions_scope`, a `values` interpretation flag, and nested `upper` / `lower` dictionaries with their own `enable` switches (`config/config.default.yaml:90-113`). Update custom configs accordingly if you previously listed plain year-value pairs.
- Transmission capacity caps gained explicit “extension” keys – `lines` uses `s_nom_max` / `s_nom_max_extension`, and `links` uses `p_nom_max` / `p_nom_max_extension` (`config/config.default.yaml:320-355`). Rename any overrides that still refer to `max_extension` so the new limits are applied.
- Selective carrier exclusion moved into configuration – `electricity.exclude_carriers` (`config/config.default.yaml:115-162`) lets you strip carriers during clustering; custom busmap logic should now read that list instead of hard-coding exclusions.
- Existing capacity toggles are explicit – The brownfield logic key is `existing_capacities.enabled` (`config/config.default.yaml:479-486`). Set it to `true` when you want `compose_network` to import historical assets; otherwise the new workflow assumes a greenfield build.
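A rough Python sketch of how a local config could be migrated is shown below. It covers only two of the changes above: moving `planning_horizons` to the root and renaming `max_extension`. The function `migrate_config` is hypothetical, and the one-to-one rename of `max_extension` to `s_nom_max_extension` / `p_nom_max_extension` is an assumption drawn from the description, so check it against `config/config.default.yaml` before relying on it.

```python
import copy
import warnings


def migrate_config(old: dict) -> dict:
    """Return a migrated copy of a legacy config dict (hypothetical helper)."""
    new = copy.deepcopy(old)  # never mutate the caller's config

    # The legacy scenario block is ignored by the new workflow, so drop it.
    scenario = new.pop("scenario", None)
    if scenario is not None:
        # planning_horizons now lives at the root of the config.
        if "planning_horizons" in scenario and "planning_horizons" not in new:
            new["planning_horizons"] = scenario["planning_horizons"]
        leftover = sorted(k for k in scenario if k != "planning_horizons")
        if leftover:
            warnings.warn(f"legacy scenario keys must be moved by hand: {leftover}")

    # Assumed rename: max_extension -> s_nom_max_extension / p_nom_max_extension.
    for section, new_key in (("lines", "s_nom_max_extension"),
                             ("links", "p_nom_max_extension")):
        block = new.get(section)
        if isinstance(block, dict) and "max_extension" in block:
            block[new_key] = block.pop("max_extension")
    return new
```

The CO₂ budget restructuring is deliberately left out, since it changes the shape of the data and needs a manual decision on `emissions_scope` and the `upper` / `lower` blocks.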
Script Changes
- `scripts/compose_network.py` combines function calls from multiple preparatory scripts (`add_electricity.py`, `add_existing_baseyear.py`, `add_brownfield.py`, `prepare_network.py`, `prepare_sector_network.py`, `prepare_perfect_foresight.py`). At a later stage a clearer packaging structure should be used here.
- Preparatory scripts are no longer executed – their main sections were integrated into `compose_network.py`.
- Solving entry points are harmonized – `scripts/solve_network.py` is the only solver script called from Snakemake; `solve_operations_network.py` (operations-only runs) is no longer referenced by the workflow.
- Summary generation moved into `scripts/make_summary.py` – the script now loads all horizons via `pypsa.NetworkCollection` for overnight, myopic, and perfect runs, so the dedicated helpers `scripts/make_summary_perfect.py` and `scripts/make_global_summary.py` were deleted. Any custom tooling should invoke `make_summary.py` and read the unified CSV outputs.
File Name Mapping
| Old target | New target | Notes |
|---|---|---|
| `networks/base_s.nc` | `networks/simplified.nc` | Paired with `regions_onshore_simplified.geojson`, `regions_offshore_simplified.geojson`, and `busmap_simplified.csv`. |
| `regions_onshore_base_s_{clusters}.geojson` / `regions_offshore_base_s_{clusters}.geojson` | `regions_onshore.geojson` / `regions_offshore.geojson` | Produced by `cluster_network`; no `{clusters}` wildcard in filename. |
| `busmap_base_s_{clusters}.csv` / `linemap_base_s_{clusters}.csv` | `busmap.csv` / `linemap.csv` | Same rule; clusters now implicit in config. |
| `powerplants_s_{clusters}.csv` | `powerplants_s.csv` | Generated after clustering; uses `networks/clustered.nc`. |
| `electricity_demand_base_s.nc` | `electricity_demand_simplified.nc` | Created from simplified assets. |
| `availability_matrix_{clusters}_{technology}.nc` | `availability_matrix_{technology}.nc` | Includes MD/UA variants. |
| `profile_{clusters}_{technology}.nc` | `profile_{technology}.nc` | `regions_by_class_*` follow the same pattern. |
| `pop_layout_base_s_{clusters}.csv` / `pop_layout_base_s.csv` | `pop_layout.csv` / `pop_layout_simplified.csv` | Solar rooftop potentials drop cluster suffixes too. |
| `gas_input_locations_s_{clusters}.geojson` | `gas_input_locations.geojson` | Simplified CSV also loses the suffix. |
| `costs_{planning_horizons}.csv` / `_processed.csv` | `costs_{horizon}.csv` / `_processed.csv` | `collect.smk` expands over `{horizon}`. |
| `networks/base_s_{clusters}_{opts}_{sector_opts}_{planning_horizons}.nc` (+ `_brownfield*`) | `networks/composed_{horizon}.nc` | Handles all foresight modes. |
| `RESULTS/networks/base_s_{clusters}_{opts}_{sector_opts}_{planning_horizons}.nc` | `RESULTS/networks/solved_{horizon}.nc` | Solver logs renamed accordingly. |
| `maps/power-network.pdf` / `maps/power-network-s-{clusters}.pdf` | `maps/base_network.pdf` / `maps/clustered_network.pdf` | Cluster preview plotting renamed. |
| `RESULTS/maps/base_s_{clusters}_{opts}_{sector_opts}-costs-all_{planning_horizons}.pdf` | `RESULTS/maps/power_network_{horizon}.pdf` | Same pattern for hydrogen, methane, and balance maps. |
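
For downstream tooling that still holds legacy paths, part of the mapping could be automated along the following lines. This is a sketch covering only a few rows of the table: the function `translate` and the regular expressions are hypothetical, `RESULTS` is treated as a literal directory name, and `_brownfield` variants are not handled.

```python
import re

# Legacy network name: base_s_{clusters}_{opts}_{sector_opts}_{planning_horizons}.nc
# (opts and sector_opts may be empty, hence the `*` quantifiers).
LEGACY_NETWORK = re.compile(
    r"base_s_(?P<clusters>[^_]+)_(?P<opts>[^_]*)_(?P<sector_opts>[^_]*)_(?P<horizon>\d+)\.nc"
)


def translate(path: str) -> str:
    """Map a legacy target path to its new name; unknown paths pass through."""
    head, _, name = path.rpartition("/")
    prefix = f"{head}/" if head else ""

    if name == "base_s.nc":
        return prefix + "simplified.nc"

    match = LEGACY_NETWORK.fullmatch(name)
    if match:
        # Networks under RESULTS are solved ones, all others are composed.
        stage = "solved" if head.startswith("RESULTS") else "composed"
        return f"{prefix}{stage}_{match['horizon']}.nc"

    match = re.fullmatch(r"(regions_(?:onshore|offshore))_base_s_[^_]+\.geojson", name)
    if match:
        return f"{prefix}{match[1]}.geojson"

    match = re.fullmatch(r"(busmap|linemap)_base_s_[^_]+\.csv", name)
    if match:
        return f"{prefix}{match[1]}.csv"

    return path
```

Paths that match none of the patterns are returned unchanged, so the helper can be applied to a whole list of targets without special-casing new-style names.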
Migration Checklist
- Update rule dependencies – Replace any references to `*_base_s*` or `*_base_s_{clusters}_*` targets with the new filenames; consume `networks/composed_{horizon}.nc` / `RESULTS/networks/solved_{horizon}.nc` instead of the legacy `base_s_*` artefacts.
- Add custom inputs and outputs from `prepare_*` and `add_*` Snakemake rules to the `compose` rule in `compose.smk` and update them to the new wildcard convention (keep only the `{horizon}` wildcard).
- Apply custom implementations in legacy scripts (`add_*.py` / `prepare_*.py`) to the corresponding sections in `compose_network.py`. Make sure to import the necessary functions from the legacy script. The sections in `compose_network.py` reference the old scripts to clearly indicate where to add the changes.
- Adjust custom scripts – Update plotting, analysis, or data-export scripts to read the `{horizon}`-based filenames and the new CSV outputs from `make_summary`.
- Warm-start expectations – Myopic extensions should request `RESULTS/networks/solved_{prev}.nc`; perfect foresight helpers should import the previous `networks/composed_{prev}.nc`. Do not craft filenames manually.
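As a first pass over custom rule files, a simple scan for legacy markers can show where migration work remains. The sketch below is only a heuristic: the function `find_legacy_references` and the marker list are hypothetical and derived from the wildcards and prefixes named in this description.

```python
from __future__ import annotations

import re

# Markers that only appear in pre-refactor targets (assumed list).
LEGACY_MARKERS = re.compile(
    r"base_s|\{clusters\}|\{opts\}|\{sector_opts\}|\{planning_horizons\}"
)


def find_legacy_references(text: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for every line with a legacy marker."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if LEGACY_MARKERS.search(line):
            hits.append((lineno, line.strip()))
    return hits
```

Running it over the text of a custom `.smk` file gives a list of lines to revisit; an empty list does not prove the file is fully migrated, only that none of these markers remain.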
Checklist
- [ ] I tested my contribution locally and it works as intended.
- [ ] Code and workflow changes are sufficiently documented.
- [ ] Changed dependencies are added to `envs/environment.yaml`.
- [ ] Changes in configuration options are added in `config/config.default.yaml`.
- [ ] Changes in configuration options are documented in `doc/configtables/*.csv`.
- [ ] Sources of newly added data are documented in `doc/data_sources.rst`.
- [ ] A release note `doc/release_notes.rst` is added.
Thanks for starting this. Had only a short look so far. I guess it is too early for reviewing. How do you think we can best support?! I have opinions about many of the "critical open questions".
You are totally right, it is too early for reviewing. I guess at this stage, it is best to think about whether I can proceed with this approach or whether there are potential red flags that speak against this strategy in the first place. But perhaps I should spend more time on refining this and write a proper report for you on the wider implications.
> I have opinions about many of the "critical open questions".
I must admit that the section Critical Unresolved Questions in the streamline document might be outdated and partially resolved (and perhaps even incomplete), so don't give it too much attention. But I would still be happy to hear any high-level thoughts and critical points that I might be missing!