
Minnesota 2012-2013 Census Blocks pulls in other states

Open frumh002 opened this issue 5 months ago • 5 comments

Hey there,

I'm working on a project that requires I use all US census blocks 2007-2024. I'm working through the code and came across an issue that so far I've only seen in MN 2012-2013. It pulls a few census blocks from surrounding states. Here is 2012 for example - and the same thing for 2013. I haven't seen any other years or states so far.

Image

It's an easy fix on my end, but wondering what might be going on?

Thanks

frumh002 avatar Jul 23 '25 17:07 frumh002

Well, this is a bit odd! I am getting a different result using tigris versus downloading the file directly from the URL (which I obtained by debugging the tigris::blocks function). I have tigris 2.2.1 installed.

library(sf)
library(tigris)

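tf <- tempfile(fileext = '.zip')
# download.file() needs the URL scheme; mode = 'wb' keeps the zip intact on Windows
download.file('https://www2.census.gov/geo/tiger/TIGER2012/TABBLOCK/tl_2012_27_tabblock.zip', destfile = tf, mode = 'wb')
tf_extracted <- unzip(tf, exdir = tempdir())
# read only the .shp member of the extracted archive
bk12 <- read_sf(grep('\\.shp$', tf_extracted, value = TRUE))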
table(bk12$STATEFP)

# 27 
# 263606 


bk12_tigris <- tigris::blocks(state = 'MN', year = 2012, protocol = 'http')
table(bk12_tigris$STATEFP)

# 19     27     46 
# 3 263602      1 

lpiep avatar Jul 23 '25 18:07 lpiep

Ah, this does actually appear in the raw census file. The file downloaded directly above (bk12 in my example) has these other state FIPS codes in the STATEFP10 field.

> table(bk12$STATEFP, bk12$STATEFP10)
    
         19     27     46
  27      3 263602      1

and the default behavior of the load_tiger() function is to overwrite STATEFP with STATEFP10.

@walkerke should there be an exception written into the function for this? Or is it just "that's how the data is"?
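
To see which records are affected, one option is to compare the two fields row by row. This is only a minimal sketch: it uses a small made-up data frame in place of the real attribute table, and the GEOID values are placeholders rather than real block IDs.

```r
# Toy stand-in for the attribute table of the raw 2012 MN block file;
# the GEOID values are invented for illustration only.
blocks_raw <- data.frame(
  STATEFP   = c("27", "27", "27", "27"),
  STATEFP10 = c("27", "19", "27", "46"),
  GEOID     = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

# Rows where the STATEFP and STATEFP10 codes disagree
mismatch <- blocks_raw[blocks_raw$STATEFP != blocks_raw$STATEFP10, ]
print(mismatch)
```

The same subsetting expression should work on the real object (bk12 above), since sf objects support row indexing with `[`.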

lpiep avatar Jul 23 '25 18:07 lpiep

I wonder if instead of this in load_tiger() (via https://github.com/walkerke/tigris/blob/master/R/helpers.R#L394-L397)

    if ("COUNTYFP10" %in% names(obj)) {
        obj$COUNTYFP <- obj$COUNTYFP10
        obj$STATEFP <- obj$STATEFP10
    }

It could be something like this:

    if ("COUNTYFP10" %in% names(obj) && !("COUNTYFP" %in% names(obj))) {
        obj$COUNTYFP <- obj$COUNTYFP10
        obj$STATEFP <- obj$STATEFP10
    }

Not sure if the cases where the name corrections in load_tiger() are required are fully documented, so this may or may not be a preferred solution. It is weird, though - I checked the 2012 technical documentation and didn't see anything like this issue mentioned.
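
To make the difference between the two conditions concrete, here is a rough sketch on a toy data frame. The two helper functions are hypothetical stand-ins that only mirror the snippets above; they are not the actual tigris source. The county codes are made up, and the sketch assumes the raw file carries both the plain and the "10" columns.

```r
# Hypothetical stand-in for the existing check: always overwrite.
overwrite_always <- function(obj) {
  if ("COUNTYFP10" %in% names(obj)) {
    obj$COUNTYFP <- obj$COUNTYFP10
    obj$STATEFP <- obj$STATEFP10
  }
  obj
}

# Hypothetical stand-in for the suggested check: only fill in
# COUNTYFP/STATEFP when COUNTYFP is not already present.
overwrite_if_missing <- function(obj) {
  if ("COUNTYFP10" %in% names(obj) && !("COUNTYFP" %in% names(obj))) {
    obj$COUNTYFP <- obj$COUNTYFP10
    obj$STATEFP <- obj$STATEFP10
  }
  obj
}

# Toy table with both sets of columns (values invented for illustration)
toy <- data.frame(
  STATEFP    = c("27", "27"),
  COUNTYFP   = c("001", "003"),
  STATEFP10  = c("27", "19"),
  COUNTYFP10 = c("001", "005"),
  stringsAsFactors = FALSE
)

always  <- overwrite_always(toy)      # STATEFP replaced by STATEFP10
guarded <- overwrite_if_missing(toy)  # existing STATEFP left untouched

# A table that has only the "10" columns still gets the plain names filled in
only10 <- toy[, c("STATEFP10", "COUNTYFP10")]
filled <- overwrite_if_missing(only10)

print(table(always$STATEFP))
print(table(guarded$STATEFP))
```

Whether the guarded version is safe for every year and geography that load_tiger() handles is exactly the open question; the sketch only shows how the two conditions behave on this one shape of input.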

elipousson avatar Jul 23 '25 19:07 elipousson

For now, given I need to use the tigris package but don't want to deal with these other state blocks, I'm just going to filter out the state FIPS codes that don't belong to the actual state I'm pulling. It's a temporary workaround, but yeah, this stumped me too. I thought that maybe there were tiny historical variations in the border block groups, but even if those did "belong" to Iowa or some other state that year, they shouldn't be pulled for MN that same year?
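
A minimal sketch of that kind of filter, assuming the returned object has a character STATEFP column. The helper name keep_own_state is hypothetical, and the toy data frame (with placeholder GEOID values) stands in for a real tigris::blocks() result so the example runs without a download.

```r
# Hypothetical helper: keep only the rows whose STATEFP matches the
# FIPS code of the state that was requested.
keep_own_state <- function(blocks, state_fips) {
  blocks[blocks$STATEFP == state_fips, ]
}

# Toy stand-in for a blocks() result for MN; values are invented
toy_mn <- data.frame(
  STATEFP = c("27", "19", "27", "46"),
  GEOID   = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

mn_only <- keep_own_state(toy_mn, "27")
print(mn_only)
```

Because sf objects support row subsetting with `[`, the same helper should work on the real result of tigris::blocks(state = 'MN', year = 2012) with state_fips = "27".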

Anywho - thanks for the help, and I hope this gets resolved. I'll leave it open for now.

frumh002 avatar Jul 24 '25 14:07 frumh002

Sorry I still don't know how to leave a comment without closing the issue...so reopening with this one...

frumh002 avatar Jul 24 '25 14:07 frumh002