Refactor / Split-Up `add_electricity`

Open trevorb1 opened this issue 1 year ago • 2 comments

Feature Request

The module description for add_electricity reads as follows, but this covers only a small part of what the module actually does:

Adds electrical generators and existing hydro storage units to a base network
based on the given network configuration.

A lot of stuff is happening in add_electricity right now, and it is a little hard to track what is going on (and I have def contributed to this confusion!)

We are:

  • Adding generators from different sources
  • Adding demand from different sources
  • Adding costs from one source (build_cost_data) but then updating different costs (capex and variable, for generators and transmission) from different sources
  • Adding emissions to carriers
  • A few other minor things (populating missing data, matching data points, cleaning data, etc.). Some of these relate directly to the other logic, and some are more general and could be repurposed.

Clearly identifying what is happening in add_electricity and potentially breaking it up should be done, imo. But open to discussion!

Suggested Solution

I think we should break each piece of functionality (i.e. adding generators, adding costs, adding demand) into its own module, and maybe its own rule as well (though separate rules may not be necessary). Standardizing data classes would also be very nice: swapping data in and out would be much easier, since we are exposing different data sources for the user to choose from.

For example, I really like the overall structure of the attach_demand(...) function (see below)! However, within the helper functions (i.e. prepare_eia_demand, etc.) we are reading data, formatting data, performing some hard-coded transformations, filtering data, accepting different arguments, finding intersections, etc. It is just a lot to keep track of. It is also unclear whether all these data sources are actually interchangeable (e.g. if you aggregate on state vs. BA, or make other high-level decisions like that). They may be, which is great! It's just hard to tell.

While I am using demand as an example, I really want to emphasize that my own code in add_electricity (for example, the updating of marginal/capital costs) suffers from these exact same problems! So I am def not trying to pick on a single piece of code. I'm hoping this starts a discussion about whether others are running into the same thing.

def attach_demand(n: pypsa.Network, configuration: str):
    """
    Add demand to network from specified configuration setting.

    Returns network with demand added.
    """
    if configuration == "ads":
        demand_per_bus = prepare_ads_demand(n, f"data/WECC_ADS/processed/load_2032.csv")
    elif configuration == "pypsa-usa":
        if snakemake.params.get("planning_horizons"):
            demand_per_bus = prepare_efs_demand(
                n,
                snakemake.params.get("planning_horizons"),
            )
        else:
            demand_per_bus = prepare_eia_demand(n, snakemake.input["eia"][0])
    else:
        raise ValueError(
            "Invalid demand_type. Supported values are 'ads', and 'pypsa-usa'.",
        )
    n.madd(
        "Load",
        demand_per_bus.columns,
        bus=demand_per_bus.columns,
        p_set=demand_per_bus,
        carrier="AC",
    )
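
To make the "standardized data classes" idea a bit more concrete, here is a rough sketch of what a shared demand interface could look like. Everything in it (DemandProfile, DemandSource, InMemoryDemand, the record keys) is hypothetical and not existing pypsa-usa code. It uses plain Python containers to stay self-contained, where the real thing would presumably hold a pandas DataFrame. The point is only that every source splits reading from formatting and returns the same validated shape.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class DemandProfile:
    """Hypothetical standardized output: one load series (MW) per bus."""

    snapshots: Tuple[str, ...]
    load_mw: Dict[str, Tuple[float, ...]]  # bus name -> MW per snapshot

    def __post_init__(self) -> None:
        # Every bus must cover every snapshot, whatever the data source.
        for bus, series in self.load_mw.items():
            if len(series) != len(self.snapshots):
                raise ValueError(
                    f"bus {bus!r} has {len(series)} values, "
                    f"expected {len(self.snapshots)}"
                )


class DemandSource(ABC):
    """Hypothetical interface: each source separates reading from formatting."""

    @abstractmethod
    def read(self) -> List[dict]:
        """Load raw records (file I/O would live here)."""

    @abstractmethod
    def to_profile(self, raw: List[dict]) -> DemandProfile:
        """Convert raw records into the shared DemandProfile."""

    def build(self) -> DemandProfile:
        return self.to_profile(self.read())


class InMemoryDemand(DemandSource):
    """Toy source standing in for an EIA/EFS/ADS reader."""

    def __init__(self, records: List[dict]) -> None:
        self.records = records

    def read(self) -> List[dict]:
        return self.records

    def to_profile(self, raw: List[dict]) -> DemandProfile:
        snapshots = tuple(sorted({r["snapshot"] for r in raw}))
        buses = sorted({r["bus"] for r in raw})
        lookup = {(r["bus"], r["snapshot"]): float(r["mw"]) for r in raw}
        load = {b: tuple(lookup[(b, s)] for s in snapshots) for b in buses}
        return DemandProfile(snapshots=snapshots, load_mw=load)


records = [
    {"bus": "b1", "snapshot": "2019-01-01 00:00", "mw": 10},
    {"bus": "b1", "snapshot": "2019-01-01 01:00", "mw": 12},
    {"bus": "b2", "snapshot": "2019-01-01 00:00", "mw": 5},
    {"bus": "b2", "snapshot": "2019-01-01 01:00", "mw": 6},
]
profile = InMemoryDemand(records).build()
```

With something like this, attach_demand would only need to pick a DemandSource and call build(), and whether two sources are interchangeable becomes a question of whether both can produce a valid DemandProfile for the same buses.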

Additional Info

No response

trevorb1 avatar Feb 24 '24 23:02 trevorb1

I also get that this can be a task that takes a while without adding any tangible improvement to output. But it can help with troubleshooting!

trevorb1 avatar Feb 24 '24 23:02 trevorb1

I hear you that add_electricity is getting unwieldy....

We could create a new module for building power plant data: pretty much anything that gets assigned to generators, like fuel costs, heat rates, ramping rates, and maybe capex, could be moved into it. This means the decision between the ADS and PyPSA-USA configurations would be made there. The output of this module would be the network.generators table and the network.generators_t.marginal_cost table.
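
A minimal sketch of the kind of calculation that module would own, assuming the usual relation marginal cost = heat rate × fuel price + variable O&M. The names (PlantData, marginal_cost_series), the units, and the example numbers are all made up for illustration. Plain dicts stand in for the two tables, which in the real module would be pandas DataFrames.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PlantData:
    """Hypothetical static record per generator (one row of the generators table)."""

    name: str
    carrier: str
    heat_rate: float  # MMBtu/MWh (assumed unit)
    vom: float  # $/MWh, variable O&M (assumed unit)


def marginal_cost_series(
    plants: List[PlantData],
    fuel_prices: Dict[str, List[float]],
) -> Dict[str, List[float]]:
    """
    Time-varying marginal cost per generator, in $/MWh.

    fuel_prices maps carrier -> fuel price in $/MMBtu, one value per snapshot.
    """
    return {
        p.name: [p.heat_rate * price + p.vom for price in fuel_prices[p.carrier]]
        for p in plants
    }


# Illustrative numbers only, not real plant or fuel data.
plants = [
    PlantData("ccgt_1", "gas", heat_rate=7.0, vom=2.0),
    PlantData("coal_1", "coal", heat_rate=10.0, vom=4.0),
]
fuel_prices = {"gas": [3.0, 4.0], "coal": [2.0, 2.0]}
costs = marginal_cost_series(plants, fuel_prices)
```

Keeping this as a pure function of plant data and fuel prices means the ADS vs. PyPSA-USA choice only changes which inputs get loaded, not how the costs are computed.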

We could separately have a rule for build_demand_data to select between demand years.
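
For the demand side, the selection logic could be pulled out into a small pure function that the new rule calls. This is only a sketch: select_demand_source and the returned labels are hypothetical, and the branching simply mirrors what attach_demand does today.

```python
from typing import Optional, Sequence


def select_demand_source(
    configuration: str,
    planning_horizons: Optional[Sequence[int]] = None,
) -> str:
    """Return a label for the demand dataset to use (hypothetical helper)."""
    if configuration == "ads":
        return "ads"
    if configuration == "pypsa-usa":
        # Planning horizons set -> EFS projections, otherwise historical EIA data.
        return "efs" if planning_horizons else "eia"
    raise ValueError(
        "Invalid configuration. Supported values are 'ads' and 'pypsa-usa'."
    )
```

Because it does not touch the snakemake object, the choice can be unit-tested without running the workflow.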

ktehranchi avatar Feb 26 '24 04:02 ktehranchi