areal
Add Dasymetric Function
Function, data, and test are not explicitly ready for merge to master, but opening this PR as a way to discuss and further develop the dasymetric function.
Outstanding:
- [ ] Implement Different Weights for Extensive (Sum vs Total)
- [ ] Implement Intensive Calculation (Mean, intermediate area dependent?)
- [ ] The Building Footprints are very large (6MB .rda). Move this to an external repo and make a web request? Smaller sample data?
- [ ] Review potential edge cases
- [ ] Full pass at Documentation
- [ ] Full pass at Testing (Contingent on test Building data)
- [ ] Test with Other Intermediate Geometries (Land Use, etc)
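
To make the first outstanding item concrete, here is a minimal base R sketch of dasymetric weighting for an extensive variable. It is not the code in this PR: the numbers are made up, and `bldg_area`, `source_bldg_total`, and `dasy_extensive()` are hypothetical names. It assumes the "sum" vs "total" distinction mirrors the existing `weight` options in `aw_interpolate()`, with building area standing in for feature area.

```r
# Hypothetical toy data: one source tract holding a count of 100 people.
# bldg_area is the building footprint area (m^2) from that tract that
# falls inside each target cell.
bldg_area <- c(cell_a = 300, cell_b = 100)

# Total building area inside the source tract. 100 m^2 of buildings lie
# outside every target cell, so this exceeds sum(bldg_area).
source_bldg_total <- 500

source_count <- 100

# Hypothetical helper, not part of areal.
# weight = "sum":   divide by the building area that intersects any target
# weight = "total": divide by all building area in the source feature
dasy_extensive <- function(area, total, value, weight = c("sum", "total")) {
  weight <- match.arg(weight)
  denom <- if (weight == "sum") sum(area) else total
  value * area / denom
}

est_sum   <- dasy_extensive(bldg_area, source_bldg_total, source_count, "sum")
est_total <- dasy_extensive(bldg_area, source_bldg_total, source_count, "total")

est_sum    # cell_a = 75, cell_b = 25: all 100 people are allocated
est_total  # cell_a = 60, cell_b = 20: 20 people remain unallocated
```

Under these assumptions, "sum" preserves the source total across the targets, while "total" leaves the share tied to uncovered buildings unassigned.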
While using this dev branch I wasted an inordinate amount of time because I had not checked my intermediate shapefile for geometries that weren't POLYGON or MULTIPOLYGON.
Would adding the following check add too much overhead to `aw_dasymetric`?
if(any(!st_geometry_type(nyc_buildings_tidy) %in% c("POLYGON", "MULTIPOLYGON"))){
stop('The `intermediate` shapefiles must contain only `POLYGON` and `MULTIPOLYGON` geometries. Remove other geometries before passing to aw_dasymetric')
}
hey @charliejhadley - first, thanks for testing out the development branch. This actually might be worth exploring for the validation process more generally. Can you open an issue referring to this and referencing the `ar_validate()` function?
@charliejhadley Thanks for trying out the dasymetric function, and thank you for the solution you provided. It does not add any considerable overhead, and certainly will alleviate some headaches for others. I went ahead and added this for all of the sf arguments.
If you run into any other problems or identify an edge case, please let us know. Thanks again, we're really excited to develop this.
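
For anyone following along, a check like this applied to every sf argument could be sketched as below. This is an illustration only, not the code that was actually added: `has_only_polygons()` and `validate_polygons()` are hypothetical helpers, and the only sf call assumed is `sf::st_geometry_type()`.

```r
# Hypothetical helpers, not part of areal.

# TRUE when every geometry type is POLYGON or MULTIPOLYGON.
# Accepts the factor returned by sf::st_geometry_type() or a character vector.
has_only_polygons <- function(geom_types) {
  all(as.character(geom_types) %in% c("POLYGON", "MULTIPOLYGON"))
}

# Stop with an informative message naming the offending argument.
validate_polygons <- function(x, arg_name) {
  if (!has_only_polygons(sf::st_geometry_type(x))) {
    stop(
      "`", arg_name, "` must contain only POLYGON and MULTIPOLYGON ",
      "geometries. Remove other geometries before passing it on.",
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# The type test itself needs no sf object:
has_only_polygons(c("POLYGON", "MULTIPOLYGON"))  # TRUE
has_only_polygons(c("POLYGON", "POINT"))         # FALSE
```

Inside a function taking several sf arguments, `validate_polygons(intermediate, "intermediate")` would be called once per argument. The check is a single vectorized pass over the geometry types, which fits the observation that it adds no considerable overhead.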
Thanks for adding this, @bransonf!
@charliejhadley - I just opened an issue on your behalf to make a more permanent switch in how we validate data!
Thanks for doing all the work on this, I wanted to check it wouldn't add too much overhead before making a PR.
thanks @charliejhadley!
hey @joshdavids - this is the branch where we're discussing the technique. Have you installed from a repo before as opposed to CRAN?
@chris-prener -- yes, I've installed from repos, but it's been a while, so a refresher would be great.
and, thanks for having me onboard, everyone.
no worries @joshdavids - you want to use `remotes::install_github("bransonf/areal")` since @bransonf is using a fork of the `areal` repo and not a branch...
@chris-prener, awesome, thank you. Is there anything in particular you want me to take a stab at testing first?
@bransonf - do you want to let @joshdavids know how to test the function?
Sure, and thanks for testing this @joshdavids
You'll need some source sf data with an extensive field (meaning counts, rather than proportion/percentage)
You'll need an intermediate sf object. I've tested only building footprints so far, but this should work with land use (zoning) data as well.
And then you'll need a target, grid squares or hexagons for example.
That's it, and the documentation should be pretty clear as to the order of arguments. (target, source, intermediate)
If you have trouble finding any of these data, here are suggestions:
- Source - You can use `tidycensus` to get population counts or some other count variable.
- Intermediate - You can find labeled building footprints here. These are just the Microsoft AI Building Footprints with county/zip etc. added, which makes it easier to filter for your area. Be warned: unless you have a ton of RAM, R may fail to handle these large files.
- Target - You can generate square/hexagon grids with the `ar_tessellate` function.
I'm not familiar with your experience in R, so if I skipped anything important, please let me know.
@bransonf -- all sounds good. I've had a lot of experience working with R (never writing packages, but using it daily in analysis for the last 5ish years) and a fair bit of more recent experience working with sf. The workflow you describe makes sense. Could you point me to where the documentation is housed? I'm having trouble finding it.
@joshdavids Sorry for the confusion. The only documentation currently is the function annotations: `?aw_dasymetric`
Or you can read the source.
The only required arguments are `target`, `source`, `intermediate` and `extensive`. The first 3 are sf objects, and the last is a character vector.
@bransonf -- great, all sounds good. I'll get to testing shortly. Looking forward.
hi all, sorry for the really long delay -- working on a paper for the Transportation Research Board conference due Aug. 1. Would it still be useful to you all for me to test in August when things free up?
@joshdavids No worries, Chris and I have been incredibly busy with COVID-related work anyway. Whenever you have a chance to test this, it will be useful to us. There's no rush, we'll probably get back to this sometime in the fall.