data-science
CoP: Data Science: City of Los Angeles Evictions
Prerequisite(s)
If you would like to work on this issue, please add a comment below and include the following information:
- Your name
- How many hours you can commit to working on this in the next week (minimum of 2)
- Commit to providing an update with a comment before the next community of practice meeting
For example:
- John Doe
- I can commit to working on this issue 3 hours in the following week.
- Yes, I will provide an update on my progress with a comment below.
Once you have done this, please add yourself to the “Assignees” section on the right and update the issue weekly to document your progress.
Overview
We want to analyze eviction data for the city of Los Angeles, and incorporate data from other sources, to determine whether there are actions local leaders can take to address the problem. The following background information is from the LA Controller's website:
- August 1, 2023 – rent owed from March 1, 2020 to August 31, 2020 is due. If a tenant returned the Declaration of COVID-19-Related Financial Distress form to the landlord within 15 days of rent being due, they cannot be evicted for nonpayment of rent.
- February 1, 2024 – rent owed from October 1, 2021 to January 31, 2023 is due. If a tenant returned the Declaration of COVID-19-Related Financial Distress form to the landlord within 15 days of rent being due AND paid 25% of rent owed from this period, they cannot be evicted for nonpayment of rent.
- However, since March 27, 2023, landlords may not evict a tenant who falls behind in rent unless the tenant owes an amount higher than the Fair Market Rent (FMR). The FMR depends on the bedroom size of the rental unit.
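The FMR threshold in the last bullet can be expressed as a simple check. This is a minimal sketch, not the city's legal test: the FMR amounts in `EXAMPLE_FMR` are made-up placeholders, and the function name and parameters are hypothetical. Real FMR values would come from a published FMR table.

```python
from datetime import date

# Hypothetical FMR amounts by bedroom count, for illustration only.
EXAMPLE_FMR = {0: 1500, 1: 1700, 2: 2200, 3: 2900}

# The threshold rule applies from this date onward (per the background above).
THRESHOLD_START = date(2023, 3, 27)


def exceeds_fmr_threshold(amount_owed, bedrooms, notice_date, fmr_table=EXAMPLE_FMR):
    """Return True if the amount owed is above the FMR for the unit size.

    Returns None when the rule does not apply (notice dated before
    March 27, 2023) or when the bedroom count is not in the table.
    """
    if notice_date < THRESHOLD_START:
        return None
    fmr = fmr_table.get(bedrooms)
    if fmr is None:
        return None
    return amount_owed > fmr
```

A check like this could be used to flag notices in the data set where the amount owed appears to be at or below the threshold, assuming the data includes an amount and a bedroom count.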
Action Items
Phase 1
- [x] Find available data sources and add to Resources section
- [x] Perform Exploratory Data Analysis (read more here)
- [x] Create data dictionary (EDA task)
- [x] Perform data cleaning (EDA task)
- [x] Understand and outline data context
- [x] Determine if this is a one-time or ongoing project (and assign appropriate label)
- [x] Write one-sheet (see Resources below)
- [x] Define stakeholder
- [x] Summarize project, including value add
- [x] Define project 6 month roadmap
- [x] Detail history (if any)
- [x] Define tools to be used for analysis and visualization (if applicable)
- [x] Create issues required to fulfill project requirements, including exploratory data analysis, required tasks, and deliverables
Resources/Instructions
Feb 2023 - July 2023 eviction data CSV file. Check #178 for updates on whether a real-time source for this data has been found.
Perhaps another reason for the influx? It would be interesting to explore: https://www.wired.com/story/generative-ai-courts-law-justice/
Jane Diokpo I can commit to working on this issue 2 hours in the following week. Yes, I will provide an update on my progress with a comment below.
@JANEDIOKPO Thanks for volunteering. The first steps are to investigate what other sources of eviction data are available, since what has been found so far is only a subset (2023 data only), and then to perform EDA on the data set.
@akhaleghi Hi, I'm completely new to data science and trying to learn. I'd appreciate it if you could send some resources on how to do an EDA or find sources.
Hi @JANEDIOKPO I'm going to move this back to the backlog because there hasn't been any activity on the issue. Let me know if you'd like to work on it.
- Pranjali Seth
- I can commit to working on this issue 12 hours in the following week.
- Yes, I will provide an update on my progress with a comment below.
I have gathered the data set, analyzed it, and experimented with a few EDA cleaning tasks.
I worked 6 hours last week. Here's the update:
The data set by itself is not very informative, and it is hard to find any trends between the given variables and the target variable. I therefore looked through other data sets on the LA Controller's website for metadata or supporting data that can be combined with the current data set to find a more concrete relationship with the target variable. I went through the following data sets: LA Homelessness Expense Tracker, LA Payroll Employee Residence Analysis, Cash for Keys, and Affordable Housing Covenants.

Of these, Cash for Keys contains information about owners paying residents to leave, which may serve as a proxy for owner dissatisfaction. High dissatisfaction in an area could mean that tenants there are failing to pay rent or to follow community guidelines. The addresses in that data set can be converted to zip codes using GeoPy, so we can compute the average buyout for each year and place and look for a relationship with the eviction notices.

I also found Fair Market Rent (FMR) figures for Los Angeles on a different website (https://www.laalmanac.com/economy/ec40b.php), and I am looking for a household income data set for LA County as well. I still need to discuss with the team how to proceed with the issue and whether to bring in other data sets with better variables.
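The average-buyout idea above can be sketched as follows. This is only an illustration under assumptions: the field names (`address`, `year`, `buyout_amount`) and the sample rows are hypothetical, and instead of GeoPy the sketch uses a simple regular expression that works only when the address string already contains a ZIP code. A GeoPy-based lookup could be passed in through the `to_zip` parameter.

```python
import re
from collections import defaultdict
from statistics import mean

# Matches 5-digit ZIP codes starting with 9 (as California ZIPs do),
# optionally followed by a ZIP+4 suffix.
ZIP_PATTERN = re.compile(r"\b(9\d{4})(?:-\d{4})?\b")


def zip_from_address(address):
    """Return the last ZIP-like token in the address, or None if absent."""
    matches = ZIP_PATTERN.findall(address)
    return matches[-1] if matches else None


def average_buyout_by_zip_year(records, to_zip=zip_from_address):
    """Group buyout amounts by (zip, year) and return the mean of each group.

    Records whose address cannot be mapped to a ZIP code are skipped.
    """
    groups = defaultdict(list)
    for rec in records:
        zip_code = to_zip(rec["address"])
        if zip_code is None:
            continue
        groups[(zip_code, rec["year"])].append(rec["buyout_amount"])
    return {key: mean(values) for key, values in groups.items()}


# Made-up rows for illustration; not taken from the Cash for Keys data.
sample = [
    {"address": "100 Example St, Los Angeles, CA 90012", "year": 2023, "buyout_amount": 20000},
    {"address": "200 Example Ave, Los Angeles, CA 90012", "year": 2023, "buyout_amount": 30000},
    {"address": "300 Example Blvd, Los Angeles, CA", "year": 2023, "buyout_amount": 10000},
]
averages = average_buyout_by_zip_year(sample)
```

The resulting averages could then be joined to eviction notice counts on zip code and year to look for a relationship.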
I read several articles on Los Angeles eviction rules and laws before, during, and after the Covid-19 pandemic to get background information. I collected a Fair Market Rent (FMR) by zip code data set and an estimated population by zip code data set for LA County, then merged the relevant data sets with the original data to look for dependencies and trends. I am currently working with the population data set to find useful insights on eviction cases and their intensity.
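One way to combine the eviction, population, and FMR data by zip code is sketched below. It is an assumption about how the merge could look, not the code used for the analysis: the function name, the dictionary inputs, and the sample values are all hypothetical.

```python
from collections import Counter


def evictions_per_thousand(eviction_zips, population_by_zip, fmr_by_zip):
    """Count notices per ZIP and attach population, FMR, and a per-1,000 rate.

    ZIP codes with missing or zero population are skipped because no rate
    can be computed for them. A missing FMR is kept as None.
    """
    counts = Counter(eviction_zips)
    rows = []
    for zip_code, notices in sorted(counts.items()):
        population = population_by_zip.get(zip_code)
        if not population:
            continue
        rows.append({
            "zip": zip_code,
            "notices": notices,
            "population": population,
            "fmr": fmr_by_zip.get(zip_code),
            "notices_per_1000": round(notices / population * 1000, 2),
        })
    return rows


# Made-up inputs for illustration only.
eviction_zips = ["90012", "90012", "90013", "99999"]
population_by_zip = {"90012": 40000, "90013": 10000}
fmr_by_zip = {"90012": 2000}

rows = evictions_per_thousand(eviction_zips, population_by_zip, fmr_by_zip)
```

Normalizing by population makes zip codes of different sizes comparable, which raw notice counts alone do not.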
- Noel Thomas
- I can commit to working on this issue 15 hours this week.
- Yes, I will provide an update on my progress with a comment below, before the next CoP meeting.
So far, I have gathered metadata (FMR, population) to add to the original eviction data set. I have merged the metadata with it and performed exploratory data analysis and data cleaning on the result. I am now working on running a couple of models on the current data to observe trends.
Rahul Iragavarapu I can commit to working on this issue 2 hours in the following week. I will provide an update on my progress with a comment below.
Provided all findings in the CoP meeting today. I will next be working on documentation for the issue checklist.
Working on the documentation; part of it is complete so far. I will also start on the presentation soon.
Uploading the Google Drive links for the data sets and code.
Google drive link - https://drive.google.com/drive/folders/1-yiJ-ZcC20wOlikNzG1zCn8vsX2H-VNt
Data and Metadata Sources:
- https://www.laalmanac.com/employment/em12c.php
- https://www.laalmanac.com/population/po24la_zip.php
- https://gist.github.com/erichurst/7882666