python-us

Add counties

Open owenam opened this issue 7 years ago • 7 comments

This adds name and FIPS data for US counties (and county equivalents like parishes), and associates them with the appropriate states. County data is pulled from the Census' 2010 TIGER data, http://www2.census.gov/geo/tiger/TIGER2010/COUNTY/2010/tl_2010_us_county10.zip.

My motivation was to have this data available to support "within county" calls to the Census API for smaller geographies like tracts and block groups. I expect that the most common use case will be to get the list of county FIPS codes for all counties in a given state.

County data is linked to states and can be accessed in a few different ways:

us.counties.COUNTIES  # All US counties
us.states.MN.counties  # Counties in Minnesota
us.counties.lookup('27')  # Lookup by state FIPS code
us.counties.lookup('27053')  # Lookup by combined state+county FIPS code
us.counties.lookup('Hennepin')  # Lookup by short name

Note that all county lookups return a list.

len(us.counties.lookup('Washington'))  # 31 states have a county named Washington
len(us.counties.lookup('Hennepin'))  #  1 state has a county named Hennepin
len(us.counties.lookup('27053'))  # Returns 1-element list for consistency
us.counties.lookup('27053')[0].state  # Counties are linked back to states
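
For readers who want to see the "always return a list" behavior in isolation, here is a minimal sketch of a lookup that dispatches on the shape of its argument. It is not the code in this pull request: the `County` tuple, its field names, and the four-row `SAMPLE_COUNTIES` table are assumptions made up for illustration, and the real data set covers every county in the TIGER file.

```python
from collections import namedtuple

# Hypothetical stand-in for the county objects; field names are assumptions.
County = namedtuple("County", ["name", "state_fips", "county_fips"])

# Tiny hand-picked sample for illustration, not the full TIGER data set.
SAMPLE_COUNTIES = [
    County("Hennepin", "27", "053"),
    County("Ramsey", "27", "123"),
    County("Washington", "27", "163"),
    County("Washington", "41", "067"),
]


def lookup(value, counties=SAMPLE_COUNTIES):
    """Return a list of matching counties; always a list, possibly empty."""
    value = value.strip()
    if value.isdigit() and len(value) == 2:
        # Two digits: treat as a state FIPS code.
        return [c for c in counties if c.state_fips == value]
    if value.isdigit() and len(value) == 5:
        # Five digits: treat as a combined state+county FIPS code.
        return [c for c in counties if c.state_fips + c.county_fips == value]
    # Anything else: case-insensitive match on the short name.
    return [c for c in counties if c.name.lower() == value.lower()]
```

With this sketch, the "all county FIPS codes in a state" use case is a one-liner such as `[c.state_fips + c.county_fips for c in lookup("27")]`.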

I've left in some bits that might not be necessary, such as the code in load_county_data.py and county_schema.sql for loading the county data into sqlite.

Finally, there may be better ways to handle the lookups and the linking to states; I'm happy to discuss!

owenam, Mar 27 '17 16:03

@owenam I was quite hesitant to add counties when previously proposed, but this looks really, really great. Based on a quick look, I think the API makes a lot of sense. Give me a couple days to play around with it and see if I have any feedback. Ping me if you don't hear anything by the end of the week 🙂

Thanks!

jcarbaugh, Mar 27 '17 21:03

I was on the fence as well, since it's quite a bit more data! But this felt like a better alternative than repeatedly hitting the SF1 API just to get county codes, or repeatedly fetching and reading the TIGER shapefiles.

The part I feel the least certain about is probably the handling of the state<->county linking in __init__.py — I haven't had a reason to do anything like that before so I'm not sure if there are better approaches.

owenam, Mar 28 '17 01:03

@jcarbaugh just checking in — what are your thoughts on this?

owenam, Apr 03 '17 16:04

This feature would be great. A few comments and questions.

Counties/Parishes/Boroughs change more often than states. And Virginia FIPS codes... How does this stay up to date? Should there be point-in-time lookup?

How to define __repr__ isn't entirely obvious. "<County: Autauga>" or "<County: Autauga, AL>"? "<County: Denali Borough>" or "<Borough: Denali>"? That said, the way it is defined now looks perfectly workable.

Should counties.lookup support counties.lookup("DeKalb, GA")?

In states.lookup, passing bytes raises TypeError: expected unicode, got str. As long as the input is ascii, it would be nice to just convert it on the front-end.
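
A minimal sketch of the front-end conversion suggested above. The helper name `to_text` is hypothetical and is not part of the library. The sketch is written for Python 3, where the two types are `bytes` and `str` (the error message quoted above comes from Python 2, where they are `str` and `unicode`).

```python
def to_text(value):
    """Return value as text, decoding ASCII bytes; hypothetical helper."""
    if isinstance(value, bytes):
        # Non-ASCII bytes raise UnicodeDecodeError instead of guessing an encoding.
        return value.decode("ascii")
    if isinstance(value, str):
        return value
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)
```

A lookup function could call this on its argument first, so that `b"MN"` and `"MN"` are handled the same way.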

There are going to be a lot of corner cases with fuzzy matching. How robustly should those be supported? Will jellyfish.metaphone match "St Francis, AR" with "St. Francis, AR"? Hyphenation may also be a sticking point.
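
One way to sidestep some of these corner cases, sketched here under stated assumptions: normalize punctuation and hyphens before comparing, and split an optional ", ST" suffix off the query. This is exact matching after normalization, not phonetic matching, and it makes no claim about how jellyfish.metaphone treats these inputs. Both function names are hypothetical.

```python
import re


def normalize_county_name(name):
    """Lowercase, drop periods/apostrophes, and turn hyphens into spaces."""
    name = name.lower()
    name = re.sub(r"[.']", "", name)      # "St." -> "st"
    name = re.sub(r"[-\s]+", " ", name)   # hyphens and whitespace runs -> one space
    return name.strip()


def split_county_state(query):
    """Split "St. Francis, AR" into (normalized name, "AR").

    The state part is None when the query contains no comma.
    """
    name, sep, state = query.rpartition(",")
    if not sep:
        return normalize_county_name(query), None
    return normalize_county_name(name), state.strip().upper()
```

Under this scheme "St Francis, AR" and "St. Francis, AR" reduce to the same key, and "DeKalb, GA" becomes a name plus a state abbreviation that can be used to narrow the result list.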

To reiterate, this is a great feature and I'll be happy to help with any of the above.

jbrockmendel, Jun 11 '17 18:06

County support would be fantastic. Anything an "outsider" can do to help move this along?

sglyon, Dec 05 '17 16:12

+1

eedduuar, Jan 16 '18 14:01

Hi, any chance to include this?

I'm up for fixing whatever needs more work in this pull request.

marcin-osowski, Feb 05 '19 19:02