counties() - State Column Name Discrepancy
counties() fails for year=2010 at line 93, where the result is filtered by state:
ctys = _load_tiger(url, cache = cache, subset_by = subset_by)
if state is not None:
    if type(state) is not list:
        state = [state]
    valid_state = [validate_state(x) for x in state]
    ctys = ctys.query('STATEFP in @valid_state')  # STATEFP not in 2010 shapefile columns
To reproduce:
from pygris import counties
import us
import random
state_list = [{"name":s.name, "fips":s.fips, 'usps':s.abbr} for s in us.states.STATES]
# random.choice avoids an out-of-range index if the list holds fewer than 51 entries
rand_state = random.choice(state_list)  # e.g. {'name': 'Nebraska', 'fips': '31', 'usps': 'NE'}
counties(state=rand_state['fips'], year=2010)
UndefinedVariableError: name 'STATEFP' is not defined
Link to TIGER file used in above example: https://www2.census.gov/geo/tiger/TIGER2010/COUNTY/2010/tl_2010_us_county10.zip
from pygris.helpers import _load_tiger
gdf = _load_tiger("https://www2.census.gov/geo/tiger/TIGER2010/COUNTY/2010/tl_2010_us_county10.zip")
print(gdf.head())
  STATEFP10 COUNTYFP10 COUNTYNS10 GEOID10          NAME10
0        02        013   01419964   02013  Aleutians East
1        02        016   01419965   02016  Aleutians West
I made a quick check of which alternative schemas exist (https://github.com/apsocarras/pygris/blob/issue-4/reprex/reprex.ipynb):
schema_str, count
"AREA, PERIMETER, CO99_D00_, CO99_D00_I, STATE, COUNTY, NAME, LSAD, LSAD_TRANS, geometry", 3
"AREA, PERIMETER, CO99_D90_, CO99_D90_I, ST, CO, NAME, geometry", 3
"GEO_ID, STATE, COUNTY, NAME, LSAD, CENSUSAREA, geometry", 3
"STATEFP00, COUNTYFP00, CNTYIDFP00, NAME00, NAMELSAD00, LSAD00, CLASSFP00, MTFCC00, UR00, FUNCSTAT00, ALAND00, AWATER00, INTPTLAT00, INTPTLON00, geometry", 3
"STATEFP10, COUNTYFP10, COUNTYNS10, GEOID10, NAME10, NAMELSAD10, LSAD10, CLASSFP10, MTFCC10, CSAFP10, CBSAFP10, METDIVFP10, FUNCSTAT10, ALAND10, AWATER10, INTPTLAT10, INTPTLON10, geometry", 3
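My proposal is to change the state filter so that, instead of assuming a column named 'STATEFP', it looks for any of the variants that appear in the schemas above (STATE, ST, STATEFP00, STATEFP10) and filters on whichever one is present.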
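A minimal sketch of what that lookup could look like. The names `STATE_COLUMN_CANDIDATES`, `find_state_column`, and `filter_by_state` are hypothetical and not part of pygris. The sketch also assumes that every variant column holds two-digit FIPS strings, which I have not verified for the older vintages.

```python
# Candidate names for the state column, taken from the schemas listed above.
# The order is an assumption: the unsuffixed name first, then the variants.
STATE_COLUMN_CANDIDATES = ("STATEFP", "STATEFP10", "STATEFP00", "STATE", "ST")


def find_state_column(columns, candidates=STATE_COLUMN_CANDIDATES):
    """Return the first candidate present in `columns`, or raise a clear error."""
    available = set(columns)
    for name in candidates:
        if name in available:
            return name
    raise ValueError(
        f"No state column found; looked for {list(candidates)} in {sorted(available)}"
    )


def filter_by_state(ctys, valid_state):
    """Hypothetical replacement for the hard-coded query on 'STATEFP'.

    Assumes `ctys` is a (Geo)DataFrame and that the state column holds
    two-digit FIPS strings in every vintage (not verified here).
    """
    state_col = find_state_column(ctys.columns)
    return ctys[ctys[state_col].isin(valid_state)]


# Column names from the 2010 shapefile shown earlier.
cols_2010 = ["STATEFP10", "COUNTYFP10", "COUNTYNS10", "GEOID10", "NAME10"]
print(find_state_column(cols_2010))  # prints "STATEFP10"
```

Raising a ValueError that lists the columns actually present would also make any future schema change easier to diagnose than the current UndefinedVariableError.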