
Missing eGRID subregion generation by fuel category reference data

Open dt-woods opened this issue 1 year ago • 6 comments

The electricity baseline has a user-defined configuration value, 'egrid_year', which determines which data file, '~/electricitylci/data/egrid_subregion_generation_by_fuelcategory_reference_[year].csv', is read in 'egrid_energy.py' (referenced in 'generation_mix.py').

ElectricityLCI only provides two of these CSV files: 2014 and 2016. See electricitylci/data/egrid_subregion_generation_by_fuelcategory_reference_2016.csv.

This means that the ELCI_2 configuration model is unsupported. It also means that future baselines are hindered by the lack of this data file.
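
A quick way to see which eGRID years a local checkout can support is to scan the data directory for the reference files. The sketch below is only illustrative: `available_reference_years` is a hypothetical helper, not part of ElectricityLCI, and the filename pattern is taken from the two files named above.

```python
import re
from pathlib import Path

# Filename pattern copied from the reference files named in this issue.
_REF_PATTERN = re.compile(
    r"^egrid_subregion_generation_by_fuelcategory_reference_(\d{4})\.csv$"
)


def available_reference_years(data_dir):
    """Return the sorted years that have a reference CSV in data_dir.

    Hypothetical helper for illustration; not part of ElectricityLCI.
    """
    years = []
    for path in Path(data_dir).iterdir():
        match = _REF_PATTERN.match(path.name)
        if match:
            years.append(int(match.group(1)))
    return sorted(years)
```

If the repository is as described here, this should report only 2014 and 2016, so a configuration with any other 'egrid_year' would point at a file that does not exist.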

In order to support the current development and future baselines, a little more transparency is needed regarding the following:

  • What is this reference data file?
  • Where does it come from?
  • How was it created?

dt-woods, Nov 20 '23 17:11

https://github.com/USEPA/ElectricityLCI/blob/master/electricitylci/data/egrid_subregion_generation_by_fuelcategory_reference_2016.csv

dt-woods, Nov 20 '23 17:11

~~Note that it does not appear that StEWI has the facility generation data from eGRID. Tried the various formats with "getInventory," but failed to find 'Electricity' data.~~

  • https://github.com/USEPA/standardizedinventories/blob/master/stewi/__init__.py#L63
  • https://github.com/USEPA/standardizedinventories/blob/master/stewi/formats.py#L10

Found it here:

  • https://github.com/USEPA/standardizedinventories/blob/master/stewi/__init__.py#L137

dt-woods, Nov 20 '23 18:11

stewi.getInventory('eGRID', year, stewiformat='flowbyfacility') will return a dataframe that includes emissions and Electricity output as a flow.

Note also that stewi.getInventoryFacilities('eGRID', year) includes the fuel type by facility.

My guess is that some combination of these generated the files originally, but I do not know.

bl-young, Nov 20 '23 19:11

Example code:

import logging
import os

import pandas as pd

from stewi import getInventory
from stewi import getInventoryFacilities

# Assumption: the package data directory is importable as 'data_dir' from
# electricitylci.globals; adjust the import if it lives elsewhere.
from electricitylci.globals import data_dir


def make_egrid_subregion_ref(year):
    """Generate the 'egrid_subregion_generation_by_fuelcategory_reference'
    CSV data file for a given year (if it does not already exist).

    Parameters
    ----------
    year : int
        Data year.
    """
    # Define the output file, which should be in data directory of package.
    ref_name = "egrid_subregion_generation_by_fuelcategory_reference_%s.csv" % year
    ref_path = os.path.join(data_dir, ref_name)

    if os.path.exists(ref_path):
        logging.info(
            "eGRID subregion generation inventory %s reference exists", year)
    else:
        logging.info(
            "Creating eGRID subregion generation inventory "
            "%s reference CSV", year)

        # Pull the inventory data (flow by facility) from stewi.
        inventory = getInventory("eGRID", year)

        # Pull facility metadata from stewi for the same year.
        meta_cols = [
            'FacilityID',
            'eGRID subregion acronym',
            'Plant primary coal/oil/gas/ other fossil fuel category'
        ]
        facilities = getInventoryFacilities("eGRID", year)[meta_cols]

        # Merge the electricity flows with the facility metadata.
        gen = pd.merge(
            left=inventory.query("FlowName == 'Electricity'"),
            right=facilities,
            on="FacilityID",
        )

        # Group by subregion and fuel category, and sum to get total
        # electricity generation. Update column names to match existing
        # CSV files in the repo.
        gen = gen.groupby(
            by=[
                'eGRID subregion acronym',
                'Plant primary coal/oil/gas/ other fossil fuel category']
        )['FlowAmount'].agg('sum').reset_index()
        gen = gen.rename(columns={
            'eGRID subregion acronym': 'Subregion',
            'Plant primary coal/oil/gas/ other fossil fuel category': 'FuelCategory',
            'FlowAmount': 'Electricity'
        })
        # Convert Electricity from MJ to MWh (1 MWh = 3600 MJ), then sort.
        gen['Electricity'] /= 3600.0
        gen = gen.sort_values(by=['FuelCategory', 'Subregion'])
        gen.to_csv(ref_path, index=False)
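
One way to check whether this method matches how the original files were built would be to regenerate a year that already exists (for example 2016) under a different file name and compare it against the committed CSV. A minimal sketch using only the standard library follows; `load_reference` and `compare_references` are hypothetical helpers, the 1 % tolerance is an arbitrary choice, and both files are assumed to carry the 'Subregion', 'FuelCategory', and 'Electricity' columns used above.

```python
import csv


def load_reference(path):
    """Read a reference CSV into {(Subregion, FuelCategory): Electricity}."""
    with open(path, newline="") as f:
        return {
            (row["Subregion"], row["FuelCategory"]): float(row["Electricity"])
            for row in csv.DictReader(f)
        }


def compare_references(old_path, new_path, rel_tol=0.01):
    """List keys that are missing from one file or differ beyond rel_tol.

    Hypothetical helper; each entry is (key, old_value, new_value), with
    None standing in for a value that is absent from that file.
    """
    old, new = load_reference(old_path), load_reference(new_path)
    problems = []
    for key in sorted(set(old) | set(new)):
        if key not in old or key not in new:
            problems.append((key, old.get(key), new.get(key)))
            continue
        scale = max(abs(old[key]), abs(new[key]))
        if scale > 0 and abs(old[key] - new[key]) / scale > rel_tol:
            problems.append((key, old[key], new[key]))
    return problems
```

An empty list would suggest the regenerated file agrees with the committed one within the tolerance; a non-empty list points at the subregion and fuel category pairs worth inspecting.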

dt-woods, Nov 20 '23 19:11

^^^ The method above will be added to egrid_facilities.py. It will be called at module level in egrid_energy.py, right before the file is accessed, so that the reference CSV is created when missing and a FileNotFoundError is avoided.
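
The call order described here could look roughly like the sketch below: build the file only when it is missing, then confirm it exists before reading. `ensure_reference` and its `builder` argument are hypothetical stand-ins (the builder would be something like `make_egrid_subregion_ref`), not existing ElectricityLCI functions.

```python
import os


def ensure_reference(ref_path, builder, year):
    """Call builder(year) if ref_path is missing, then confirm it exists.

    Hypothetical wrapper; builder is expected to write ref_path itself.
    """
    if not os.path.exists(ref_path):
        builder(year)
    if not os.path.exists(ref_path):
        raise FileNotFoundError(
            "Reference CSV was not created: %s" % ref_path)
    return ref_path
```

Raising explicitly after the build attempt keeps a failed download or an unsupported year from surfacing later as a less obvious read error.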

dt-woods, Nov 20 '23 22:11

NOTE: I found no reference to either "egrid_subregion_totals_reference_2016.csv" or "egrid_subregion_totals_reference_2014.csv" so I omitted their creation.

dt-woods, Nov 20 '23 22:11