Research incorporating population data into 311-data reports/dashboards

Open chelseybeck opened this issue 3 years ago • 17 comments

Overview

Comparing neighborhoods by the number of requests when each neighborhood council differs widely in population size can be misleading.

Action Items

  • [ ] Research ways to join 311-data requests with population data

Resources/Instructions

311-Data Project Onboarding Tech Stack

chelseybeck avatar Oct 15 '21 22:10 chelseybeck

@chelseybeck each NC serves about 40,000 people. They have subdivided in the past when they grew larger than that. Is this still needed?

ExperimentsInHonesty avatar Feb 18 '22 03:02 ExperimentsInHonesty

@ExperimentsInHonesty, I found this census data from 2010 by neighborhood council. According to this data, the population sizes of the different neighborhood councils vary quite widely. You mentioned that they are supposed to serve ~40k people; do you know if this is something that changed after 2010? It is not clear to me which neighborhood council boundaries were used in the analysis, but the dataset appears to have been updated fairly recently, in 2020.

Here is the link to the data: https://data.lacity.org/Community-Economic-Development/Census-Data-by-Neighborhood-Council/nwj3-ufba

Here is the population size histogram: image

If there is indeed such a population spread it would be worth normalizing the data by population size.
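As a minimal sketch of what normalizing by population could look like, the snippet below converts raw request counts into requests per 1,000 residents. The function name, the NC names, and all numbers are made up for illustration; they are not taken from the 311 data or the census table.

```python
def requests_per_1000(request_counts, populations):
    """Return 311 requests per 1,000 residents for each neighborhood council."""
    rates = {}
    for nc, count in request_counts.items():
        pop = populations.get(nc)
        if not pop:
            # Missing or zero population: the rate is undefined.
            rates[nc] = None
            continue
        rates[nc] = 1000 * count / pop
    return rates


# Hypothetical NCs with identical raw counts but very different populations.
counts = {"NC A": 1200, "NC B": 1200}
pops = {"NC A": 20000, "NC B": 80000}

rates = requests_per_1000(counts, pops)
for nc, rate in rates.items():
    print(nc, rate)
```

With equal raw counts, the smaller NC ends up with a rate four times higher, which is the effect a raw-count comparison would hide.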

piotrsan avatar Jun 11 '22 07:06 piotrsan

To adjust the data by population, I intersected the 2020 census block data from LA County ArcGIS with the neighborhood council boundaries to obtain population estimates for each neighborhood council. The analysis yielded the table NC_pop_2020.csv, which was used to adjust the number of 311 requests by population in the neighborhood.py dashboard.
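One common way to turn such an intersection into population estimates is area-weighted apportionment: each census block contributes population to an NC in proportion to the share of the block's area that falls inside it. The sketch below illustrates the idea with axis-aligned rectangles standing in for real polygons. It is not necessarily what ArcGIS did here, it assumes population is spread evenly within a block, and every name and number in it is hypothetical.

```python
def rect_area(r):
    """Area of a rectangle given as (xmin, ymin, xmax, ymax)."""
    return (r[2] - r[0]) * (r[3] - r[1])


def overlap_area(a, b):
    """Area of the intersection of two axis-aligned rectangles."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0) * max(height, 0)


def area_weighted_population(blocks, councils):
    """Apportion block populations to councils by overlapping area.

    blocks: list of (rectangle, population) pairs
    councils: dict mapping council name -> rectangle
    """
    totals = {name: 0.0 for name in councils}
    for rect, pop in blocks:
        block_area = rect_area(rect)
        if block_area == 0:
            continue  # degenerate block, nothing to apportion
        for name, nc_rect in councils.items():
            share = overlap_area(rect, nc_rect) / block_area
            totals[name] += pop * share
    return totals


# Toy geometry: two side-by-side councils and three census blocks.
councils = {"NC A": (0, 0, 2, 2), "NC B": (2, 0, 4, 2)}
blocks = [
    ((0, 0, 1, 2), 100),  # entirely inside NC A
    ((1, 0, 3, 2), 200),  # straddles the boundary, half in each NC
    ((3, 0, 4, 2), 50),   # entirely inside NC B
]

estimates = area_weighted_population(blocks, councils)
print(estimates)
```

With real boundaries the rectangle helpers would be replaced by polygon intersection in a GIS library, but the apportionment step stays the same.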

piotrsan avatar Jul 19 '22 05:07 piotrsan

To recap, Anupriya and Piero have both been working on ways to get more up-to-date population data per-NC.

Action items for @priyakalyan and @piotrsan:

  • [ ] Explain why getting population counts is a difficult problem (new NCs? other things? I don't quite remember).
  • [ ] Create write ups of your methodologies to get the population data. You can post them on this issue for now.
  • [ ] Post the results of your methodologies. Please create a single Google Sheet that includes methodologies A, B, and C. Also include percentage differences per NC between different methodologies (e.g., we need columns like "Percentage difference between A and B")
  • [ ] Based on the comparisons, pick a winner. Obviously, we have no ground truth here, so we'll likely need to make a judgement call. If all methodologies give similar results, I would lean towards choosing the simplest one. The simplest one will be easiest to maintain and explain.
  • [ ] If necessary, merge the code to produce our chosen methodology.
  • [ ] Find a centralized location to publish our population data and share it with the team so that other data scientists can control for population in their analyses.
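For the comparison columns, here is a rough sketch of how the per-NC percentage differences between methodologies could be computed before pasting them into the sheet. The method labels, NC names, and population figures are placeholders, and measuring the difference relative to the first method of each pair is an assumption.

```python
from itertools import combinations


def pct_difference(a, b):
    """Percentage difference of b relative to a."""
    return 100 * (b - a) / a


def compare_methods(estimates):
    """Build one row per NC with each method's estimate and pairwise differences.

    estimates: dict mapping method label -> dict of NC name -> population
    """
    methods = sorted(estimates)
    # Only compare NCs that every method has an estimate for.
    ncs = sorted(set.intersection(*(set(v) for v in estimates.values())))
    rows = []
    for nc in ncs:
        row = {"NC": nc}
        for m in methods:
            row[m] = estimates[m][nc]
        for m1, m2 in combinations(methods, 2):
            label = f"Percentage difference between {m1} and {m2}"
            row[label] = pct_difference(estimates[m1][nc], estimates[m2][nc])
        rows.append(row)
    return rows


# Placeholder estimates from two hypothetical methodologies.
estimates = {
    "A": {"NC A": 40000, "NC B": 50000},
    "B": {"NC A": 42000, "NC B": 49000},
}

rows = compare_methods(estimates)
for row in rows:
    print(row)
```

Each row can be written out as one line of a CSV or spreadsheet, so adding a third methodology only adds columns.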

nichhk avatar Jul 19 '22 20:07 nichhk

There is no available table of population broken down by neighborhood council. The only data available is derived from 2010 census data using old NC boundaries (when there used to only be 97 NCs instead of the current 99).

We have preliminary data from ArcGIS derived by intersecting the current 99 NC boundaries with 2020 census block population data. It is not clear how ArcGIS performs this calculation internally.

Anupriya is doing the analysis from start to end. Her results should be used as the final population estimates table for any normalization needs.

piotrsan avatar Jul 28 '22 04:07 piotrsan

Here is a notebook detailing the calculation to determine the updated LA neighborhood council population using geospatial analysis.

This notebook compares the two methods (ArcGIS and geopandas) that were used to estimate the current NC populations.

Check pop_compare to access the Google Sheet with all the updated information.

I am attaching the CSV file too, in case there is an issue opening the Google Sheet: pop_compare.csv

priyakalyan avatar Jul 28 '22 05:07 priyakalyan

@priyakalyan have you seen this? It's an approximation of population by neighborhood for LA, but the neighborhoods that they are using are different (and more granular) than actual Neighborhood Councils. At the bottom, they explain their methodology. They are taking a winner-take-all approach, meaning that they are not attempting to split census tracts across neighborhood boundaries.
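To make the winner-take-all idea concrete, here is a small sketch in which each census tract's full population goes to a single neighborhood instead of being split. Choosing the winner by largest overlap share is an assumption on my part, not something their methodology confirms, and the tract data is invented.

```python
def winner_take_all(tracts):
    """Assign each tract's whole population to one neighborhood.

    tracts: list of dicts with 'population' and 'overlap', where 'overlap'
    maps neighborhood name -> share of the tract inside that neighborhood.
    """
    totals = {}
    for tract in tracts:
        overlap = tract["overlap"]
        if not overlap:
            continue  # tract touches no neighborhood
        # Assumed rule: the neighborhood with the largest overlap wins.
        winner = max(overlap, key=overlap.get)
        totals[winner] = totals.get(winner, 0) + tract["population"]
    return totals


# Invented tracts that each straddle two neighborhoods.
tracts = [
    {"population": 3000, "overlap": {"Neighborhood X": 0.9, "Neighborhood Y": 0.1}},
    {"population": 2000, "overlap": {"Neighborhood X": 0.4, "Neighborhood Y": 0.6}},
]

totals = winner_take_all(tracts)
print(totals)
```

For these toy tracts, splitting by overlap share instead would give 3,500 and 1,500 rather than 3,000 and 2,000, so the two approaches can diverge noticeably when tracts straddle boundaries.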

The LATimes also publishes population density stats, but again, the neighborhoods that they use are different from actual NCs. I couldn't find their exact methodology, but I bet they'd be willing to help us if we emailed them ([email protected]).

nichhk avatar Jul 28 '22 22:07 nichhk

@nichhk I did see that a long time ago when I was searching extensively for population details, but I did not explore it further back then. I can take a look at it. Thanks for the feedback!

priyakalyan avatar Jul 29 '22 01:07 priyakalyan

Here are the 2020 estimates for Los Angeles from the Census Bureau.

https://www.census.gov/quickfacts/fact/table/losangelescitycalifornia/PST045221

piotrsan avatar Jul 29 '22 01:07 piotrsan

@nichhk I emailed the LA Times last week asking them to share their methodology for calculating the LA city population. Still waiting for their response.

Here is an updated notebook to calculate the population of the LA city NCs after adding an area and population filter. In this version, the total population of all the NCs is very close to the Census Bureau value.

priyakalyan avatar Aug 09 '22 04:08 priyakalyan

@priyakalyan were you able to hear back from the LA Times?

akhaleghi avatar Aug 16 '22 20:08 akhaleghi

@akhaleghi I have not heard back from them yet. It has been more than 10 days. I wrote a follow-up email today.

priyakalyan avatar Aug 16 '22 20:08 priyakalyan

Hi @priyakalyan, are there any updates on this issue?

akhaleghi avatar Aug 31 '22 17:08 akhaleghi

I have incorporated changes to the updated_NC_pop repo based on the feedback given by @salice. I will create a PR and add it to the 311 repo this week.

priyakalyan avatar Aug 31 '22 23:08 priyakalyan

Hey @priyakalyan could you provide a brief update on this issue? (I know you're waiting for a review but we just want to keep the status up-to-date here)

akhaleghi avatar Sep 16 '22 17:09 akhaleghi

Hi @akhaleghi, sure. I have created a PR to address the NC population issue. Waiting for the review process.

priyakalyan avatar Sep 17 '22 03:09 priyakalyan

Here is an update: the second round of review is done. The review process is still ongoing.

priyakalyan avatar Sep 30 '22 17:09 priyakalyan

This issue can be closed now! Here is a link to the CSV file with the updated population, population density, and area (in square miles) of all 99 NCs.

priyakalyan avatar Nov 15 '22 16:11 priyakalyan

Anupriya, thank you for your persistence on this very challenging issue!

nichhk avatar Nov 15 '22 22:11 nichhk