311-data
Research incorporating population data into 311-data reports/dashboards
Overview
Comparing neighborhoods by the number of requests when each neighborhood council differs widely in population size can be misleading.
Action Items
- [ ] Research ways to join 311-data requests with population data
Resources/Instructions
@chelseybeck each NC serves about 40,000 people. They have subdivided in the past when they grew larger than that. Is this still needed?
@ExperimentsInHonesty, I found this census data from 2010 by neighborhood council. According to this data, there is quite a large spread in the population sizes of the different neighborhood councils. You mentioned that they are supposed to serve ~40k people; do you know if this is something that changed after 2010? It is not clear to me which neighborhood council boundaries were used in the analysis, but the dataset seems to have been updated fairly recently, in 2020.
Here is the link to the data: https://data.lacity.org/Community-Economic-Development/Census-Data-by-Neighborhood-Council/nwj3-ufba
Here is the population size histogram:
If there is indeed such a population spread it would be worth normalizing the data by population size.
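As a minimal sketch of what normalizing by population could look like, the snippet below converts raw request counts into requests per 1,000 residents. The function name, the NC names, and all numbers are hypothetical and only illustrate the arithmetic; they are not taken from the 311 data or the census table.

```python
def requests_per_1000(request_counts, populations):
    """Return 311 requests per 1,000 residents for each neighborhood council."""
    rates = {}
    for nc, count in request_counts.items():
        pop = populations.get(nc)
        if not pop:
            # Skip NCs with a missing or zero population instead of dividing by zero.
            continue
        rates[nc] = 1000 * count / pop
    return rates


# Hypothetical values: both NCs file the same number of requests,
# but one has four times as many residents.
request_counts = {"NC A": 1200, "NC B": 1200}
populations = {"NC A": 20000, "NC B": 80000}

rates = requests_per_1000(request_counts, populations)
for nc, rate in sorted(rates.items()):
    print(f"{nc}: {rate:.1f} requests per 1,000 residents")
```

With equal raw counts, the per-capita rates differ by a factor of four, which is the distortion the overview describes.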
To adjust the data by population, I intersected the 2020 census block data from LA County ArcGIS with the neighborhood council boundaries to obtain a population estimate for each neighborhood council. The analysis yielded the table NC_pop_2020.csv, which was used to adjust the number of 311 requests by population in the neighborhood.py dashboard.
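The thread does not spell out how the intersection distributes a block's population when the block straddles an NC boundary. One common approach is area weighting, sketched below under strong simplifying assumptions: every shape is an axis-aligned rectangle `(xmin, ymin, xmax, ymax)`, population is spread evenly within a block, and all names and numbers are hypothetical. Real boundaries are polygons and would need a GIS library.

```python
def overlap_area(a, b):
    """Area of the intersection of two axis-aligned rectangles."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0) * max(height, 0)


def area_weighted_population(blocks, councils):
    """Give each council the share of a block's population that matches
    the share of the block's area falling inside that council."""
    totals = {name: 0.0 for name in councils}
    for rect, population in blocks:
        block_area = overlap_area(rect, rect)
        if block_area == 0:
            continue  # degenerate block, nothing to distribute
        for name, council_rect in councils.items():
            share = overlap_area(rect, council_rect) / block_area
            totals[name] += population * share
    return totals


# Hypothetical layout: two councils side by side, three census blocks.
councils = {"NC A": (0, 0, 2, 2), "NC B": (2, 0, 4, 2)}
blocks = [
    ((0, 0, 1, 1), 100),  # entirely inside NC A
    ((1, 0, 3, 1), 200),  # straddles the boundary, half in each
    ((3, 1, 4, 2), 50),   # entirely inside NC B
]

estimates = area_weighted_population(blocks, councils)
print(estimates)
```

The straddling block contributes 100 people to each council here because half of its area lies on each side of the boundary.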
To recap, Anupriya and Piero have both been working on ways to get more up-to-date population data per-NC.
Action items for @priyakalyan and @piotrsan:
- [ ] Explain why getting population counts is a difficult problem (new NCs? other things? I don't quite remember).
- [ ] Create write ups of your methodologies to get the population data. You can post them on this issue for now.
- [ ] Post the results of your methodologies. Please create a single Google Sheet that includes methodologies A, B, and C. Also include percentage differences per NC between different methodologies (e.g., we need columns like "Percentage difference between A and B")
- [ ] Based on the comparisons, pick a winner. Obviously, we have no ground truth here, so we'll likely need to make a judgement call. If all methodologies give similar results, I would lean towards choosing the simplest one. The simplest one will be easiest to maintain and explain.
- [ ] If necessary, merge the code to produce our chosen methodology.
- [ ] Find a centralized location to publish our population data and share it with the team so that other data scientists can control for population in their analyses.
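The per-NC comparison columns requested above could be computed along these lines. This is a sketch only: the method labels A, B, and C come from the action item, but the NC names, the population figures, and the choice to express each difference relative to the first method of the pair are assumptions.

```python
from itertools import combinations


def percentage_difference(base, other):
    """Percentage difference of `other` relative to `base`."""
    return 100.0 * (other - base) / base


def compare_methods(estimates):
    """Build one row per NC with each method's estimate and the
    percentage difference for every pair of methods."""
    methods = sorted(estimates)
    # Only compare NCs that every method produced an estimate for.
    councils = sorted(set.intersection(*(set(estimates[m]) for m in methods)))
    rows = []
    for nc in councils:
        row = {"NC": nc}
        for method in methods:
            row[method] = estimates[method][nc]
        for first, second in combinations(methods, 2):
            row[f"pct_diff_{first}_{second}"] = percentage_difference(
                estimates[first][nc], estimates[second][nc]
            )
        rows.append(row)
    return rows


# Hypothetical estimates from three methodologies.
estimates = {
    "A": {"NC X": 40000, "NC Y": 50000},
    "B": {"NC X": 42000, "NC Y": 50000},
    "C": {"NC X": 38000, "NC Y": 55000},
}

rows = compare_methods(estimates)
for row in rows:
    print(row)
```

Each row can be written straight to a CSV or pasted into the shared Google Sheet, with one `pct_diff_*` column per pair of methodologies.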
There is no available table of population broken down by neighborhood council. The only data available is derived from 2010 census data using old NC boundaries (when there used to only be 97 NCs instead of the current 99).
We have preliminary data from ArcGIS derived by intersecting the current 99 NC boundaries with 2020 census block population data. It is not clear how ArcGIS performs this intersection internally.
Anupriya is doing the analysis from start to end. This should be used as the final population estimates table for any normalization needs.
Here is a notebook detailing the calculation to determine the updated LA neighborhood council population using geospatial analysis.
This notebook compares the two methods (ArcGIS and geopandas) that were used to estimate the recent NC populations.
Check pop_compare to access the Google Sheet with all the updated information.
I am also attaching the CSV file in case there is an issue opening the Google Sheet: pop_compare.csv
@priyakalyan have you seen this? It's an approximation of population by neighborhood for LA, but the neighborhoods that they are using are different (and more granular) than actual Neighborhood Councils. At the bottom, they explain their methodology. They are taking a winner-take-all approach, meaning that they are not attempting to split census tracts across neighborhood boundaries.
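For contrast with splitting a tract across boundaries, a winner-take-all assignment might look like the sketch below. The linked source only says that tracts are not split; choosing the neighborhood with the largest overlap as the "winner" is an assumption made here, as are the rectangle shapes `(xmin, ymin, xmax, ymax)`, the names, and the numbers.

```python
def overlap_area(a, b):
    """Area of the intersection of two axis-aligned rectangles."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0) * max(height, 0)


def winner_take_all_population(tracts, neighborhoods):
    """Assign each tract's whole population to the single neighborhood
    that overlaps it the most (assumed tie-break: first one seen)."""
    totals = {name: 0 for name in neighborhoods}
    for rect, population in tracts:
        best_name, best_area = None, 0
        for name, neighborhood_rect in neighborhoods.items():
            area = overlap_area(rect, neighborhood_rect)
            if area > best_area:
                best_name, best_area = name, area
        if best_name is not None:
            totals[best_name] += population
    return totals


# Hypothetical layout: the second tract lies one third in N1, two thirds in N2.
neighborhoods = {"N1": (0, 0, 2, 2), "N2": (2, 0, 4, 2)}
tracts = [
    ((0, 0, 1, 1), 100),
    ((1, 0, 4, 1), 300),
]

totals = winner_take_all_population(tracts, neighborhoods)
print(totals)
```

Here all 300 residents of the straddling tract go to N2, whereas an area-weighted split would give 100 of them to N1. The gap between the two approaches grows as tracts get larger relative to the neighborhoods.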
The LATimes also publishes population density stats, but again, the neighborhoods that they use are different from actual NCs. I couldn't find their exact methodology, but I bet they'd be willing to help us if we emailed them ([email protected]).
@nichhk I did see that a long time ago when I was extensively searching for population details, but I did not explore it further back then. I can take a look at it. Thanks for the feedback!
Here are the 2020 estimates for Los Angeles from the Census Bureau.
https://www.census.gov/quickfacts/fact/table/losangelescitycalifornia/PST045221
@nichhk I emailed the LA Times last week asking for their methodology for calculating the LA city population. I am still waiting for their response.
Here is an updated notebook to calculate the population of the LA city NCs after adding an area and population filter. In this recent version, the total population of all the NCs is very close to the Census Bureau value.
@priyakalyan were you able to hear back from the LA Times?
@akhaleghi I have not heard back from them yet. It has been more than 10 days. I wrote a follow-up email today.
Hi @priyakalyan, are there any updates on this issue?
I have incorporated changes to the updated_NC_pop repo based on the feedback given by @salice. I will create a PR and add it to the 311 repo this week.
Hey @priyakalyan could you provide a brief update on this issue? (I know you're waiting for a review but we just want to keep the status up-to-date here)
Hi @akhaleghi, sure. I have created a PR to address the NC population issue. I am waiting for the review.
Here is an update: the second round of review is done, and the review process is still ongoing.
This issue can be closed now! Here is a link to the CSV file with the updated population, population density, and area (in square miles) for all 99 NCs.
Anupriya, thank you for your persistence on this very challenging issue!