gridemissions

Bulk downloads only cover the last 5 months

Open zaneselvans opened this issue 1 year ago • 3 comments

I was just looking at the bulk downloads listed on this page and found that some of them don't cover the expected range of dates.

zaneselvans, Dec 29 '23 21:12

The newer files date back to the updates I made for the EIA's switch to v2 of their API (#12). At that time, I also updated the gridemissions API so that it stores only one month of historical data. What I did not do back then was go back and process the historical data, which broke the description on the page you referenced.

Updating the historical data was less straightforward than I initially thought because the EIA also changed the way they make data available in bulk. Instead of one giant CSV file with all of the data, they now make the data available in six-month chunks and split it into two files (see here).

Since the datasets generated by this codebase directly depend on the Grid Monitor dataset, I think it makes sense for the tools here to also process data in the same six-month chunks.
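To illustrate the chunking, here is a minimal sketch of how a date range could be split into half-year periods. It assumes the chunks run January to June and July to December; `half_year_chunks` is a hypothetical helper for illustration, not a function from this repo or from the EIA, and the dates in the usage lines are arbitrary.

```python
from datetime import date


def half_year_chunks(start: date, end: date) -> list[tuple[date, date]]:
    """Return (first_day, last_day) for each half-year overlapping [start, end].

    Assumption: chunks are Jan 1 - Jun 30 and Jul 1 - Dec 31.
    """
    chunks = []
    year = start.year
    half = 1 if start.month <= 6 else 2
    while True:
        first = date(year, 1, 1) if half == 1 else date(year, 7, 1)
        if first > end:
            break
        last = date(year, 6, 30) if half == 1 else date(year, 12, 31)
        chunks.append((first, last))
        # Move on to the next half-year period.
        if half == 1:
            half = 2
        else:
            year, half = year + 1, 1
    return chunks


# Usage with arbitrary example dates.
for first, last in half_year_chunks(date(2022, 1, 1), date(2023, 12, 31)):
    print(first.isoformat(), "to", last.isoformat())
```

Working per chunk like this keeps each processing run independent, so a failed chunk can be rerun without redoing the others.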

I just opened #20 to process those six-month datasets and launched the workflow in that PR's Makefile. Eleven six-month chunks are currently available. The CvxCleaner step takes about 1 hour per six-month chunk on the machine I am using; the other steps are not very expensive. I'd like to do a bit of sanity checking once that run completes. If that all looks good, I'll update the way these files are distributed and update the description correspondingly.
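One possible sanity check, sketched below, is to confirm that the processed chunks cover the expected date range without holes. This is only an illustration and not the check used in #20; `find_gaps` is a hypothetical helper, and it assumes each chunk is described by a non-overlapping (first_day, last_day) pair.

```python
from datetime import date, timedelta


def find_gaps(periods: list[tuple[date, date]]) -> list[tuple[date, date]]:
    """Return (gap_start, gap_end) for uncovered days between consecutive periods.

    Assumption: periods do not overlap one another.
    """
    gaps = []
    ordered = sorted(periods)
    one_day = timedelta(days=1)
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        expected_start = prev_end + one_day
        if next_start > expected_start:
            gaps.append((expected_start, next_start - one_day))
    return gaps


# Usage with arbitrary example dates: the second half of 2022 is missing.
periods = [
    (date(2022, 1, 1), date(2022, 6, 30)),
    (date(2023, 1, 1), date(2023, 6, 30)),
]
print(find_gaps(periods))
```

An empty list means consecutive chunks line up day to day; it says nothing about missing hours inside a chunk, which would need a separate check on the data itself.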

Thank you for the reminder to do this!

@ktehranchi FYI

jdechalendar, Jan 01 '24 14:01

FYI, we have some functions in the OGE repo to download and process the six-month files into the format you use here. Not sure if this would be helpful.

grgmiller, Jan 02 '24 16:01

> FYI, we have some functions in the OGE repo to download and process the six month files into the format you use here. Not sure if this would be helpful.

Just saw this. Your code looks similar to what I did here.

BTW, I noticed when looking at your code that it looks like you are using the conventions I used for the EIA API before their v2 update. More recent versions of the code in this repo use a different convention. See eia_api_v2.py.

jdechalendar, Jan 16 '24 22:01