Katie Lamb
Even though the output layer will eventually be obsolete, all the CCAI stuff should go into a PUDL output table for now. This function should include all the new columns...
# Overview
Integrate the `splink` model into PUDL to conduct the FERC to EIA entity matching. Closes [32](https://github.com/catalyst-cooperative/ccai-entity-matching/issues/32) and [110](https://github.com/catalyst-cooperative/ccai-entity-matching/issues/110) but more generally is one of the last lingering things...
In the process of setting up general record linkage infrastructure for the FERC - EIA match, we added experiment tracking infrastructure with MLflow that allows developers to track model parameters...
We are interested in archiving Ex. 21 of the SEC 10-K filings, which is a PDF attachment that's not published as part of the 10-K XBRL filings. [CorpWatch](https://www.corpwatch.org/) has scraped...
One of the lessons from previous unstructured data extraction projects is the need to collect as much metadata as possible and to be able to go from a structured database of...
Now that we have more generalized experiment tracking and record linkage infrastructure, it's become harder to figure out how to set up a new model or add experiment tracking. Add...
### Description
The first line of work for the Mozilla AI for Environmental Justice grant to perform record linkage between EIA utility data and SEC utility ownership (proposal [here](https://docs.google.com/document/d/1-WIWes1tyophw0vImjhv3DmkefVyQXW54_C_Xk2YJCI/edit?usp=sharing)). This...
CorpWatch publishes the Ex. 21 data in a structured, machine-readable format via their API. While it doesn't have the percentage ownership, this data would be good for us to...
As was pointed out in #3251, `test_validate_override_fixes` is currently an integration test for the FERC to EIA entity matching module that tests the functionality of `validate_override_fixes`. The test has...