pudl
Incorrect (or non-varying) `plant_name_eia`
Describe the bug
I would like to be able to use PUDL to access the current `plant_name_eia` of a plant. Plant names do change over time, but PUDL currently seems to treat the name as a static attribute. This was discussed in this thread (https://github.com/catalyst-cooperative/pudl/issues/1748#issuecomment-1183505797) but was ultimately not implemented.
In the real world, we often work with plant-level data that may have a non-standardized plant name associated with it, but in order to link this data to any of the EIA/CEMS/OGE data that we have, we need to map these non-standardized names to the EIA IDs first. The easiest way to do this is to programmatically fuzzy match the list of non-standardized plant names to their EIA plant names, and thus their EIA IDs. However, when the plant names are out of date, this fuzzy matching is less effective.
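For context, here is a minimal sketch of the kind of fuzzy matching I mean, using only the standard library's `difflib`. The function name `match_plant_ids`, the input name "Ratcliffe Plant", and the placeholder plant (ID 900001) are invented for illustration; only the 57037 names come from this issue. It shows how a stale name can cause a match to be missed:

```python
from difflib import get_close_matches


def match_plant_ids(raw_names, name_to_id, cutoff=0.6):
    """Map each non-standardized name to the plant ID of its closest EIA name.

    Returns None for names with no candidate above the similarity cutoff.
    """
    # Compare case-insensitively so capitalization does not affect the score.
    lowered = {name.lower(): plant_id for name, plant_id in name_to_id.items()}
    matches = {}
    for raw in raw_names:
        hits = get_close_matches(raw.lower().strip(), list(lowered), n=1, cutoff=cutoff)
        matches[raw] = lowered[hits[0]] if hits else None
    return matches


# Hypothetical name tables: 900001 is a made-up placeholder plant.
stale_names = {"Kemper County": 57037, "Placeholder Station Alpha": 900001}
current_names = {"Ratcliffe": 57037, "Placeholder Station Alpha": 900001}

# "Ratcliffe Plant" stands in for a name found in some external dataset.
stale_result = match_plant_ids(["Ratcliffe Plant"], stale_names)
current_result = match_plant_ids(["Ratcliffe Plant"], current_names)
print(stale_result, current_result)
```

With the stale name table the external name finds no match, while the current name table resolves it to 57037.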
One specific example: plant 57037 is listed as "Kemper County" in PUDL, even though the name of this plant is reported as "Ratcliffe" in EIA.
Bug Severity
How badly is this bug affecting you?
- Medium: With some effort, I can work around the bug.
To Reproduce
N/A
Expected behavior
I would like to have an annually varying `plant_name_eia` attribute that can be accessed, even if it is in addition to the static name used in the plants entity table.
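To illustrate how I would use such an attribute, here is a rough sketch that picks the most recently reported name per plant from annual records. The record layout, the function name `latest_plant_names`, and the report years are assumptions made for illustration (the years are placeholders, not the actual years in which EIA reported each name), and this is not how PUDL stores the data today:

```python
def latest_plant_names(records):
    """Return {plant_id_eia: plant_name_eia} using each plant's latest report year.

    `records` is an iterable of (plant_id_eia, report_year, plant_name_eia) tuples.
    """
    latest = {}
    for plant_id, report_year, name in records:
        # Keep the name from the most recent report year seen so far.
        if plant_id not in latest or report_year > latest[plant_id][0]:
            latest[plant_id] = (report_year, name)
    return {plant_id: name for plant_id, (_, name) in latest.items()}


# Hypothetical annual records; the years are placeholders.
annual_records = [
    (57037, 2001, "Kemper County"),
    (57037, 2002, "Ratcliffe"),
]

current = latest_plant_names(annual_records)
print(current)
```

The static name in the plants entity table could stay as it is; a lookup like this would only need the annually varying column to exist alongside it.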
Software Environment?
N/A
Additional context
None