pypsa-usa
Consolidate Census Sources
Feature Request
When building demand based on FERC 714, we pull Census DP1 data from PUDL. The sector-coupled data uses separately downloaded census files to calculate urban/rural populations.
We should consolidate these sources (i.e., use the PUDL data for the sector coupling side as well).
Suggested Solution
A couple of notes:
- It seems PUDL uses the 2010 census, while the sector side currently uses 2016.
- I think PUDL pulls the geodatabase files; I'm not sure if the urban/rural proportions are actually included in there. They may be, though!
Additional Info
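import duckdb

# Open an in-memory DuckDB session and attach the PUDL Census DP1 SQLite file.
con = duckdb.connect(database=":memory:", read_only=False)
con.execute("INSTALL sqlite;")
con.execute("LOAD sqlite;")
con.execute("ATTACH 'censusdp1tract.sqlite' (TYPE SQLITE);")
sql = """
SELECT *
FROM
censusdp1tract.state_2010census_dp1;
"""
# Query through `con`, the connection the SQLite file was attached to.
df = con.execute(sql).df()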
> df.shape
> (52, 197)
> df.columns
> Index(['objectid', 'shape', 'geoid10', 'stusps10', 'name10', 'aland10', 'awater10', 'intptlat10', 'intptlon10', 'dp0010001', ... 'dp0200001', 'dp0210001', 'dp0210002', 'dp0210003', 'dp0220001', 'dp0220002', 'dp0230001', 'dp0230002', 'shape_length', 'shape_area'], dtype='object', length=197)
> df.index
> RangeIndex(start=0, stop=52, step=1)
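To help answer whether urban/rural fields are in the PUDL file at all, a minimal sketch like the following could list every table and its columns and then filter column names by a keyword. It uses the standard-library `sqlite3` module rather than DuckDB, and the helper names `list_tables_and_columns` and `find_columns` are hypothetical, not part of pypsa-usa or PUDL.

```python
import sqlite3
from contextlib import closing


def list_tables_and_columns(db_path: str) -> dict[str, list[str]]:
    """Map each table in a SQLite file to its column names (hypothetical helper)."""
    with closing(sqlite3.connect(db_path)) as con:
        tables = [
            row[0]
            for row in con.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # PRAGMA table_info returns one row per column; index 1 is the column name.
        return {
            table: [col[1] for col in con.execute(f'PRAGMA table_info("{table}")')]
            for table in tables
        }


def find_columns(schema: dict[str, list[str]], keyword: str) -> dict[str, list[str]]:
    """Return, per table, the columns whose name contains `keyword` (case-insensitive)."""
    keyword = keyword.lower()
    hits = {t: [c for c in cols if keyword in c.lower()] for t, cols in schema.items()}
    # Drop tables with no matching columns.
    return {t: cols for t, cols in hits.items() if cols}
```

For example, `find_columns(list_tables_and_columns("censusdp1tract.sqlite"), "dp021")` would list the columns starting with that code prefix. Since the columns shown above are coded (`dp0010001`, ...), a name search alone will not say which fields, if any, hold urban/rural counts; that would still need the Census DP1 field documentation.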