albersusa

Weed out data

Open · hadley opened this issue 7 years ago • 1 comment

Given:

## $ geo_id              <chr> "0400000US04", "0400000US05", "0400000US06", "0400000US08", "0400000US09", "0400000US11...
## $ fips_state          <chr> "04", "05", "06", "08", "09", "11", "13", "17", "18", "22", "27", "28", "30", "35", "38...
## $ name                <chr> "Arizona", "Arkansas", "California", "Colorado", "Connecticut", "District of Columbia",...
## $ lsad                <chr> "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",...
## $ census_area         <dbl> 113594.084, 52035.477, 155779.220, 103641.888, 4842.355, 61.048, 57513.485, 55518.930, ...
## $ iso_3166_2          <chr> "AZ", "AR", "CA", "CO", "CT", "DC", "GA", "IL", "IN", "LA", "MN", "MS", "MT", "NM", "ND...
## $ census              <int> 6392017, 2915918, 37253956, 5029196, 3574097, 601723, 9687653, 12830632, 6483802, 45333...
## $ pop_estimataes_base <int> 6392310, 2915958, 37254503, 5029324, 3574096, 601767, 9688681, 12831587, 6484192, 45334...
## $ pop_2010            <int> 6411999, 2922297, 37336011, 5048575, 3579345, 605210, 9714464, 12840097, 6490308, 45455...
## $ pop_2011            <int> 6472867, 2938430, 37701901, 5119661, 3590537, 620427, 9813201, 12858725, 6516560, 45759...
## $ pop_2012            <int> 6556236, 2949300, 38062780, 5191709, 3594362, 635040, 9919000, 12873763, 6537632, 46047...
## $ pop_2013            <int> 6634997, 2958765, 38431393, 5272086, 3599341, 649111, 9994759, 12890552, 6570713, 46292...
## $ pop_2014            <int> 6731484, 2966369, 38802500, 5355866, 3596677, 658893, 10097343, 12880580, 6596855, 4649...

I think you should only save:

## $ geo_id              <chr> "0400000US04", "0400000US05", "0400000US06", "0400000US08", "0400000US09", "0400000US11...
## $ fips_state          <chr> "04", "05", "06", "08", "09", "11", "13", "17", "18", "22", "27", "28", "30", "35", "38...
## $ name                <chr> "Arizona", "Arkansas", "California", "Colorado", "Connecticut", "District of Columbia",...
## $ census_area         <dbl> 113594.084, 52035.477, 155779.220, 103641.888, 4842.355, 61.048, 57513.485, 55518.930, ...
## $ iso_3166_2          <chr> "AZ", "AR", "CA", "CO", "CT", "DC", "GA", "IL", "IN", "LA", "MN", "MS", "MT", "NM", "ND...

Possibly also standardise the variables to match an R package that provides population data.
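A minimal base R sketch of the trimming suggested above. The data frame `states` is a hypothetical three-row stand-in built from the first values shown in the listing (it is not the package's actual object), and `keep`, `states_slim`, `pop` and `rejoined` are illustrative names only. The final `merge()` just shows that population can still be attached by `fips_state` when it lives elsewhere.

```r
# Hypothetical stand-in for the state data, using the first three rows
# from the listing above (only a few of the population columns are included).
states <- data.frame(
  geo_id      = c("0400000US04", "0400000US05", "0400000US06"),
  fips_state  = c("04", "05", "06"),
  name        = c("Arizona", "Arkansas", "California"),
  lsad        = c("", "", ""),
  census_area = c(113594.084, 52035.477, 155779.220),
  iso_3166_2  = c("AZ", "AR", "CA"),
  census      = c(6392017L, 2915918L, 37253956L),
  pop_2014    = c(6731484L, 2966369L, 38802500L),
  stringsAsFactors = FALSE
)

# Columns proposed for saving: identifiers and area only.
keep <- c("geo_id", "fips_state", "name", "census_area", "iso_3166_2")
states_slim <- states[, keep, drop = FALSE]

# Population kept separately (here taken from the same toy data)
# and joined back on the FIPS code when needed.
pop <- states[, c("fips_state", "pop_2014")]
rejoined <- merge(states_slim, pop, by = "fips_state")
```

Keeping `fips_state` and `iso_3166_2` in the slim table is what makes the later join possible, whichever population source is used.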

hadley, Jan 27 '17 13:01

Seconded.

bhaskarvk, Feb 21 '17 17:02