justice40-tool
Income imputation clean up
Goals:
- [x] Use rtree spatial join to impute
- [x] Verify state and county are relevant
- [ ] Ensure imputation is performing as planned
**Score Deployed!**
Find it here:
- Score Full usa.csv: https://justice40-data.s3.amazonaws.com/data-pipeline-staging/1881/6d9e11d0816dfa2fdad5bc0a7c7267eb01be63a0/data/score/csv/full/usa.csv
- Download Zip Packet: https://justice40-data.s3.amazonaws.com/data-pipeline-staging/1881/6d9e11d0816dfa2fdad5bc0a7c7267eb01be63a0/data/score/downloadable/Screening_Tool_Data.zip
**Map Deployed!**
Map with Staging Backend: https://screeningtool.geoplatform.gov/en?flags=stage_hash=1881/6d9e11d0816dfa2fdad5bc0a7c7267eb01be63a0
Find tiles here: https://justice40-data.s3.amazonaws.com/data-pipeline-staging/1881/6d9e11d0816dfa2fdad5bc0a7c7267eb01be63a0/data/score/tiles
Also, a question for the braintrust: do we think this requires testing? It's mostly just using the sjoin method from gpd, so I was leaning toward no. It won't be that hard to test -- we can just take 10 adjacent tracts and make a few of them missing in fake data -- but is it overkill? (cc @mattbowen-usds)
@emma-nechamkin I personally would, to exercise the tract -> county -> state fallback behavior and document that. The spatial join itself isn't that interesting to test, but the actual behavior of the imputation is.
This makes sense to me @mattbowen-usds -- I think I can probably do that with either a unit test or simple asserts. I can wrap this up in the morning.
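For reference, a minimal sketch of what such an assert-based check could look like. This is not the pipeline's code: `impute_income`, the fake GEOIDs, and the neighbor map are all hypothetical, and the fallback rule (mean of neighboring tracts, then county mean, then state mean) is an assumption based on the tract -> county -> state description above. The neighbor map stands in for the output of the spatial join. The only fact relied on is that an 11-digit tract GEOID starts with a 2-digit state code and a 5-digit state-plus-county code.

```python
from statistics import mean


def impute_income(incomes, neighbors):
    """Fill missing tract incomes (hypothetical helper, not the pipeline's API).

    incomes:   dict of tract GEOID -> income, or None when missing
    neighbors: dict of tract GEOID -> list of adjacent tract GEOIDs
               (stand-in for the result of a spatial join)
    Assumed fallback order: neighboring tracts, then county, then state.
    Only originally reported values are averaged, never imputed ones.
    """

    def mean_known(geoids):
        known = [incomes[g] for g in geoids if incomes.get(g) is not None]
        return mean(known) if known else None

    imputed = {}
    for geoid, value in incomes.items():
        if value is not None:
            imputed[geoid] = value
            continue
        candidates = (
            neighbors.get(geoid, []),                    # adjacent tracts
            [g for g in incomes if g[:5] == geoid[:5]],  # same county
            [g for g in incomes if g[:2] == geoid[:2]],  # same state
        )
        means = (mean_known(group) for group in candidates)
        imputed[geoid] = next((m for m in means if m is not None), None)
    return imputed


# Fake data: a few tracts with some incomes removed.
fake_incomes = {
    "01001000100": 40000.0,
    "01001000200": 60000.0,
    "01001000300": 80000.0,
    "01001000400": None,  # has neighbors with values
    "01001000500": None,  # no usable neighbors, county has values
    "01003000100": None,  # county has no values, state does
    "01005000100": 100000.0,
    "02001000100": None,  # nothing available anywhere in the state
}
fake_neighbors = {
    "01001000400": ["01001000100", "01001000200"],
    "01001000500": ["01001000400"],
}

result = impute_income(fake_incomes, fake_neighbors)
```

Each missing tract in the fake data exercises one level of the fallback, so a handful of asserts on `result` documents the intended behavior without needing real geometries.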
@emma-nechamkin - is this still in process? Totally OK if it is! Just looking to clean up some tickets to prep for Sprint Planning. TY
This is not going to be completed by the time I leave today, so I suggest deferring it to a refactor.