pudl
`pudl.analysis.allocate_gen_fuel` is dropping/adding data
Describe the bug
As part of our OGE validation checks, we compare the fuel and emissions totals in the `generation_fuel_by_generator_energy_source_monthly_eia923` table to the generation and fuel totals in `generation_fuel_eia923`, since these should match (i.e. the `allocate_gen_fuel` process should only allocate the data, not add or drop data). This validation check returned the following warning:
It appears that, for a small number of plants, the `allocate_gen_fuel` pipeline is either adding a large amount of generation or fuel to the totals or subtracting it from them. I haven't yet been able to trace why this might be happening.
Bug Severity
How badly is this bug affecting you?
- Medium: With some effort, I can work around the bug.
To Reproduce
Compare the generation and fuel totals from EIA-923 to the data in the `generation_fuel_by_generator_energy_source_monthly_eia923` table.
Expected behavior
The total generation and fuel for each plant should be the same before and after `allocate_gen_fuel`.
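The expected behavior above can be checked with a per-plant comparison of totals. The following is a minimal sketch, not the actual OGE validation code: the function name `compare_plant_totals`, the tolerance, and the toy values are made up for illustration, and the column names `plant_id_eia`, `net_generation_mwh`, and `fuel_consumed_mmbtu` are assumed to be present in both tables.

```python
import numpy as np
import pandas as pd


def compare_plant_totals(inputs, outputs, value_cols, key="plant_id_eia", rtol=1e-3):
    """Return output-minus-input totals for plants whose totals do not match.

    Hypothetical helper: column names and tolerance are assumptions.
    """
    in_totals = inputs.groupby(key)[value_cols].sum()
    out_totals = outputs.groupby(key)[value_cols].sum()
    # Plants present in only one table are compared against a total of zero.
    in_totals, out_totals = in_totals.align(out_totals, join="outer", fill_value=0.0)
    diff = out_totals - in_totals
    close = np.isclose(
        out_totals.to_numpy(), in_totals.to_numpy(), rtol=rtol, atol=1e-6
    )
    # Keep a plant if any of its value columns is outside the tolerance.
    return diff.loc[~close.all(axis=1)]


# Toy data, invented for this example (not real EIA-923 values).
value_cols = ["net_generation_mwh", "fuel_consumed_mmbtu"]
gen_fuel = pd.DataFrame(
    {
        "plant_id_eia": [1, 1, 2],
        "net_generation_mwh": [10.0, 20.0, 5.0],
        "fuel_consumed_mmbtu": [100.0, 200.0, 50.0],
    }
)
allocated = pd.DataFrame(
    {
        "plant_id_eia": [1, 1, 2],
        "net_generation_mwh": [15.0, 15.0, 5.0],
        "fuel_consumed_mmbtu": [150.0, 150.0, 30.0],
    }
)

mismatches = compare_plant_totals(gen_fuel, allocated, value_cols)
print(mismatches)
```

In the toy data, plant 1 is only reallocated between rows, so it passes; plant 2 loses fuel in the output, so it is returned with a negative difference in the fuel column.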
Digging into this further, I started looking into plant 54809, which the validation check shows is missing some fuel consumption data, but not net generation data. This plant has six IC petroleum generators and one ST natural gas generator (which reportedly retired in September 2022).
It looks like the source of this error is that the EIA-923 generation fuel table reports the ST generator continuing to consume fuel in Sept-Dec, after it retired, even though there is no generation data reported for these months.
Strangely, this plant reports two different rows for each prime mover/fuel combination, one with the "CHP plant" flag set to yes and one set to no. In September, the fuel consumption data switches from CHP to non-CHP for the same prime mover/fuel.
In this case at least, perhaps dropping this fuel data is correct due to the retirement.
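One way to look for other cases like this is to flag fuel reported for a plant/prime mover in months on or after the date by which all of its generators have retired. This is only a sketch under assumptions: the helper name `fuel_after_retirement` is hypothetical, the column names (`generator_retirement_date`, `report_date`, `prime_mover_code`, `fuel_consumed_mmbtu`) are assumed, and the toy rows are invented rather than taken from plant 54809.

```python
import pandas as pd


def fuel_after_retirement(gen_fuel, generators):
    """Flag fuel reported on or after the retirement of every matching generator.

    Hypothetical helper; column names are assumptions.
    """
    keys = ["plant_id_eia", "prime_mover_code"]
    summary = (
        generators.assign(retired=generators["generator_retirement_date"].notna())
        .groupby(keys, as_index=False)
        .agg(
            last_retirement=("generator_retirement_date", "max"),
            all_retired=("retired", "all"),
        )
    )
    # Only consider plant/prime mover groups with no remaining active generators.
    summary = summary[summary["all_retired"]]
    merged = gen_fuel.merge(summary, on=keys, how="inner")
    late = (merged["report_date"] >= merged["last_retirement"]) & (
        merged["fuel_consumed_mmbtu"] > 0
    )
    cols = keys + ["report_date", "fuel_consumed_mmbtu"]
    return merged.loc[late, cols].reset_index(drop=True)


# Toy data, invented for this example.
generators = pd.DataFrame(
    {
        "plant_id_eia": [1, 1],
        "prime_mover_code": ["ST", "IC"],
        "generator_retirement_date": pd.to_datetime(["2022-09-01", None]),
    }
)
gen_fuel = pd.DataFrame(
    {
        "plant_id_eia": [1, 1, 1],
        "prime_mover_code": ["ST", "ST", "IC"],
        "report_date": pd.to_datetime(["2022-08-01", "2022-10-01", "2022-10-01"]),
        "fuel_consumed_mmbtu": [10.0, 5.0, 7.0],
    }
)

late_fuel = fuel_after_retirement(gen_fuel, generators)
print(late_fuel)
```

Whether flagged rows should be dropped or kept is a separate judgment call; the sketch only surfaces them.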
OK, it turns out that one issue was in how I was comparing inputs to outputs: I was comparing the outputs to the `generation_fuel_eia923` table, which I didn't realize excludes nuclear fuel data (this was the root cause of some of these inconsistencies). After switching to `denorm_generation_fuel_combined_eia923`, we get the following:
Plant 10613 is the plant that reports negative fuel consumption in May, so I am assuming this value is getting filtered out as anomalous.
Some of these other ones (58256, 59817) appear to have inconsistent prime mover codes between the generators table and the generation fuel table, which is resulting in some of this data getting dropped. I had thought we had implemented a check for this in this module. However, these plants are PV/BA plants with small amounts of generation, so the inconsistent generation/fuel totals should not be a big issue at this point.
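A quick way to list plants with this kind of mismatch is an anti-join between the prime mover codes in the generation fuel table and those in the generators table. The sketch below is an illustration, not the check in the PUDL module: `find_unmatched_prime_movers` is a hypothetical name, the column names are assumed, and the toy rows are invented.

```python
import pandas as pd


def find_unmatched_prime_movers(
    gen_fuel, generators, keys=("plant_id_eia", "prime_mover_code")
):
    """Return plant/prime mover pairs in gen_fuel that have no generator.

    Hypothetical helper; column names are assumptions.
    """
    keys = list(keys)
    gf_keys = gen_fuel[keys].drop_duplicates()
    gen_keys = generators[keys].drop_duplicates()
    # indicator=True adds a "_merge" column marking rows found only on the left.
    merged = gf_keys.merge(gen_keys, on=keys, how="left", indicator=True)
    unmatched = merged.loc[merged["_merge"] == "left_only", keys]
    return unmatched.reset_index(drop=True)


# Toy data, invented for this example.
gen_fuel = pd.DataFrame(
    {
        "plant_id_eia": [1, 1, 2],
        "prime_mover_code": ["PV", "BA", "ST"],
    }
)
generators = pd.DataFrame(
    {
        "plant_id_eia": [1, 2],
        "prime_mover_code": ["PV", "ST"],
    }
)

unmatched = find_unmatched_prime_movers(gen_fuel, generators)
print(unmatched)
```

Rows returned here are candidates for data that cannot be allocated to any generator and may therefore be dropped.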