Integrate the new FERC 1 XBRL archive into the PUDL Datastore.

Open zschira opened this issue 2 years ago • 1 comments

Background

To ingest XBRL data into PUDL, we need a datastore that can interpret XBRL archives (#1593). Each archive consists of a set of XBRL filings plus metadata pulled from the RSS feed and stored in a JSON file. The metadata lists the filings submitted by each individual filer for a specified year and period, along with additional information such as the date-time each filing was submitted. This is required because filers can resubmit filings at any time, so there may be multiple filings for a filer for a specific year/period, and PUDL must know which one to use.

Design

The datastore will open the metadata file and find the most recent filing for every filer/year/period combination. We will assume that the most recent filing is the best one to process. It will then read these files into in-memory buffers, which will be passed to the XBRL extractor.
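
As a rough illustration of the selection step described above, here is a minimal Python sketch. The metadata field names (`filer_id`, `year`, `period`, `submitted`, `filename`) and the `read_bytes` callable are hypothetical stand-ins, since the actual JSON layout and archive access are not specified in this issue; the real datastore may look quite different.

```python
import io
from datetime import datetime


def latest_filings(metadata):
    """Return the most recently submitted filing per (filer, year, period).

    `metadata` is a list of dicts with hypothetical keys; the real
    metadata file may use a different structure.
    """
    latest = {}
    for filing in metadata:
        key = (filing["filer_id"], filing["year"], filing["period"])
        submitted = datetime.fromisoformat(filing["submitted"])
        # Keep this filing only if it is newer than the one seen so far.
        if key not in latest or submitted > latest[key][0]:
            latest[key] = (submitted, filing)
    return {key: filing for key, (_, filing) in latest.items()}


def load_buffers(filings, read_bytes):
    """Read each selected filing into an in-memory buffer.

    `read_bytes` is an assumed callable mapping a filename to its raw
    bytes (for example, a wrapper around a zip archive).
    """
    return {
        key: io.BytesIO(read_bytes(filing["filename"]))
        for key, filing in filings.items()
    }
```

A caller would pass the parsed metadata to `latest_filings` and hand the resulting buffers to the extractor. Resubmissions for the same filer/year/period are dropped in favor of the latest one, which matches the assumption stated above that the most recent filing is the best one to process.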

zschira avatar Jun 02 '22 15:06 zschira

Is this finished? Or is it finished enough in the xbrl_integration branch?

cmgosnell avatar Jul 06 '22 18:07 cmgosnell

This is so finished.

zaneselvans avatar Nov 17 '22 22:11 zaneselvans