
Add dataset: US_Counties10

Open whilei opened this issue 1 year ago • 4 comments

Thanks for this great lib. I've added US_Counties10, and thought I'd PR in case that's something you'd like to have here too.

du -sh data/US_Counties10.gz 
1.7M    data/US_Counties10.gz

whilei avatar Dec 08 '23 18:12 whilei

One reason you may not want to merge this is that US counties are not well standardized. For example, Alaska has boroughs, municipalities, and census areas but no counties, and Louisiana has parishes instead of counties. See https://github.com/kjhealy/fips-codes/blob/master/county_fips_master.csv

That said, the feature does add US counties. As long as the user expects actual counties, rather than any area smaller than a state but larger than a city that is conventionally and idiomatically called a "county", this will return them as they actually are.

A workaround for this could be to group all the possible subdivision types under an assumed alias of County, potentially yielding more conventionally-expected results.

cat data/US_Counties10.geojson | jj -p | grep TYPE | grep -v TYPE_EN | sort | uniq -c
     13         "TYPE": "Borough", 
     11         "TYPE": "Census Area", 
     41         "TYPE": "City", 
      4         "TYPE": "City and Borough", 
   3008         "TYPE": "County", 
      3         "TYPE": "(County Equivalent)", 
      1         "TYPE": "District of Columbia", 
      2         "TYPE": "Municipality", 
     77         "TYPE": "Municipio", 
     64         "TYPE": "Parish", 
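
A minimal sketch of the grouping workaround, written in Python purely for illustration. It assumes each feature's properties carry the `TYPE` values listed above; `COUNTY_EQUIVALENTS`, `normalize_type`, and the `TYPE_ALIAS` key are hypothetical names and not part of the library.

```python
# Hypothetical sketch: tag every county-equivalent subdivision with a
# shared alias while leaving the original TYPE value untouched.
COUNTY_EQUIVALENTS = {
    "Borough",
    "Census Area",
    "City",
    "City and Borough",
    "County",
    "(County Equivalent)",
    "District of Columbia",
    "Municipality",
    "Municipio",
    "Parish",
}


def normalize_type(properties):
    """Return a copy of `properties` with TYPE_ALIAS set to "County"
    when TYPE is one of the county-equivalent values."""
    out = dict(properties)  # copy so the caller's dict is not mutated
    if out.get("TYPE") in COUNTY_EQUIVALENTS:
        out["TYPE_ALIAS"] = "County"  # hypothetical key name
    return out
```

Keeping the original `TYPE` next to the alias would let callers who care about the distinction still see "Parish" or "Borough".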

whilei avatar Dec 08 '23 19:12 whilei

Hi, thanks for the PR. I'm not really looking to add more datasets into the lib, but I would be happy to add the counties field to allow you to get that info when using datasets that aren't included. There should probably be a way to get any arbitrary field from the geojson, but I'd be happy to add this for now.
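
To make the "arbitrary field" idea concrete, here is a rough Python sketch of reading any named property from a GeoJSON feature. It is not the library's API: `get_property`, `count_property`, and the tiny `SAMPLE` collection are made up for illustration, and the `NAME` key is an assumption about the dataset.

```python
import json
from collections import Counter


def get_property(feature, name, default=None):
    """Return one entry from a GeoJSON feature's "properties" object."""
    props = feature.get("properties") or {}  # "properties" may be null
    return props.get(name, default)


def count_property(collection, name):
    """Tally the values of one property across a FeatureCollection."""
    return Counter(get_property(f, name) for f in collection.get("features", []))


# Made-up sample data; geometries are omitted for brevity.
SAMPLE = json.loads("""
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "geometry": null, "properties": {"NAME": "Orleans", "TYPE": "Parish"}},
  {"type": "Feature", "geometry": null, "properties": {"NAME": "Travis", "TYPE": "County"}},
  {"type": "Feature", "geometry": null, "properties": {"NAME": "Harris", "TYPE": "County"}}
]}
""")

if __name__ == "__main__":
    print(get_property(SAMPLE["features"][0], "TYPE"))
    print(count_property(SAMPLE, "TYPE"))
```

With a generic accessor like this, a caller chooses which keys matter for their dataset, so no per-dataset field would need to be hard-coded.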

sams96 avatar Dec 08 '23 21:12 sams96

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (9db03ae) 98.58% compared to head (4193ef2) 98.63%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master      #28      +/-   ##
==========================================
+ Coverage   98.58%   98.63%   +0.04%     
==========================================
  Files           2        2              
  Lines         141      146       +5     
==========================================
+ Hits          139      144       +5     
  Misses          1        1              
  Partials        1        1              


codecov[bot] avatar Dec 08 '23 21:12 codecov[bot]

It's fine with me either way; I'm happy to use a fork for what I need. I think moving toward arbitrary fields from whatever datasets are provided is a very good idea. It would move the library away from being a provider of data structures (which could become a nightmare, and is the real limiter for adding new datasets) and toward the business of just handling the geometries.

whilei avatar Dec 09 '23 15:12 whilei