gridemissions
Bulk downloads only cover the last 5 months
I was just looking at the bulk downloads listed on this page and found that some of them don't cover the expected range of dates.
- The reconciled electricity data is supposed to go back to July, 2018, but only reaches July, 2023 (there are ~3400 hourly records in the file).
- The CO2 emissions data have the same truncated timeline.
- The raw EIA-930 data is also truncated to July, 2023.
- The older reconciled electricity data contains the expected range of dates (2015-07-01 to 2018-07-01, 26,300 hourly records).
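For anyone who wants to repeat this kind of check on a downloaded file, here is a minimal sketch using only the Python standard library. It assumes the file is a CSV with one row per hour and a timestamp column; the column name `period` and the function names are hypothetical and are not taken from the gridemissions code or file layout.

```python
import csv
import io
from datetime import datetime, timedelta


def hours_between(start, end):
    """Number of whole hours in the half-open interval [start, end)."""
    return int((end - start) / timedelta(hours=1))


def summarize_coverage(csv_text, time_column="period"):
    """Return (first, last, n_rows, n_missing_hours) for hourly CSV text.

    `time_column` is a hypothetical column name; change it to match the file.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    stamps = sorted(datetime.fromisoformat(row[time_column]) for row in reader)
    if not stamps:
        raise ValueError("no rows found")
    first, last = stamps[0], stamps[-1]
    # Hours expected if every hour from first to last (inclusive) were present.
    expected = hours_between(first, last) + 1
    missing = expected - len(set(stamps))
    return first, last, len(stamps), missing


if __name__ == "__main__":
    # Tiny inline sample with one gap (02:00 is absent).
    sample = (
        "period,D\n"
        "2023-07-01 00:00:00,1\n"
        "2023-07-01 01:00:00,2\n"
        "2023-07-01 03:00:00,3\n"
    )
    print(summarize_coverage(sample))
    # Hours in the older file's stated range, 2015-07-01 to 2018-07-01.
    print(hours_between(datetime(2015, 7, 1), datetime(2018, 7, 1)))
```

For a real file, the text can be read with `open(path).read()` and passed to `summarize_coverage`. Comparing the reported first timestamp with the documented start date shows right away whether a file is truncated.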
The newer files date back to the updates I made for the EIA's switch to v2 of their API (#12). At that time, I also updated the gridemissions API to store only one month of historical data. What I did not do back then was go back and process the historical data, which broke the description on the page you referenced.
Updating the historical data was less straightforward than I initially thought because the EIA also changed the way they make data available in bulk. Instead of one giant CSV file with all of the data, they now make the data available in six-month chunks and split it into two files (see here).
Since the datasets generated by this codebase directly depend on the Grid Monitor dataset I think it makes sense for the tools here to also process data in the same six-month chunks.
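As an illustration of what processing in six-month chunks could look like, the sketch below enumerates half-year periods and builds two file names per period. The helper names and the `EIA930_{kind}_{year}_{label}.csv` pattern are assumptions made for this example, not something confirmed in this thread; check the EIA bulk download page for the actual names.

```python
def half_year_periods(first, last):
    """List (year, label) pairs for each half-year from first to last, inclusive.

    `first` and `last` are (year, half) tuples, with half equal to 1 or 2.
    """
    labels = {1: "Jan_Jun", 2: "Jul_Dec"}
    periods = []
    year, half = first
    while (year, half) <= last:
        periods.append((year, labels[half]))
        # Step to the next half-year.
        if half == 1:
            half = 2
        else:
            year, half = year + 1, 1
    return periods


def chunk_file_names(year, label):
    """Two file names per chunk; the naming pattern is an assumption."""
    return [f"EIA930_{kind}_{year}_{label}.csv" for kind in ("BALANCE", "INTERCHANGE")]


if __name__ == "__main__":
    for year, label in half_year_periods((2018, 2), (2023, 2)):
        print(chunk_file_names(year, label))
```

With these example bounds, the second half of 2018 through the second half of 2023, the loop visits eleven periods. The bounds were picked for illustration and may not match the exact set of chunks the EIA publishes.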
I just opened #20 to process those six-month datasets and launched the workflow in that PR's Makefile. Eleven six-month chunks are currently available. The CvxCleaner step takes about 1 hour per six-month chunk on the machine I am using; the other steps are not very expensive. I'd like to do a bit of sanity checking once that run completes. If that all looks good, I'll update the way these files are distributed and update the description correspondingly.
Thank you for the reminder to do this!
@ktehranchi FYI
FYI, we have some functions in the OGE repo to download and process the six month files into the format you use here. Not sure if this would be helpful.
Just saw this. Your code looks similar to what I did here.
BTW, I noticed when looking at your code that it looks like you are using the conventions I used to use for the EIA API before their v2 update. More recent versions of the code in this repo use a different convention. See eia_api_v2.py.