Python-Data-Wrangling-Legacy
D-Lab's 3-hour introduction to data wrangling in Python. Learn how to import and manipulate DataFrames using pandas.
`Click the "Launch" button under "Jupyter Notebooks" and navigate through your file system to the Python-Data-Visualization folder you downloaded above.` The folder name should read Python-Data-Wrangling instead of Python-Data-Visualization.
Some of the participants asked if there could be more time spent on `groupby()`, as they need it for their research. I feel it's important to spend at least 20'...
We might want to add a little section at the start of the notebook to explain the difference between DataFrame and Series objects.
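A minimal sketch of what such an opening section could show. The column names and values below are made up for illustration and are not taken from the workshop data.

```python
import pandas as pd

# Hypothetical two-row table; the real notebook uses its own dataset.
df = pd.DataFrame({"country": ["at", "be"], "unemployment_rate": [5.1, 7.3]})

col = df["unemployment_rate"]    # single brackets select one column as a Series (1-D)
sub = df[["unemployment_rate"]]  # double brackets select a list of columns as a DataFrame (2-D)

print(type(col).__name__, col.shape)
print(type(sub).__name__, sub.shape)
```

The Series has a single axis, while the DataFrame keeps both rows and columns even when it holds only one column.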
Using `.round()` in Manipulating Columns needs to be explained. Without it, floating-point error can leave the result just below a whole number, and calling `int()` then truncates it to the integer below. Compare `((unemployment['year_month']...
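The effect can be illustrated with a small sketch. The values here are hypothetical and chosen only because they are known to hit floating-point error; they are not the notebook's `year_month` expression.

```python
import pandas as pd

# Hypothetical fractions; multiplying by 100 lands just below a whole number.
fractions = pd.Series([0.29, 0.57])
scaled = fractions * 100

truncated = scaled.astype(int)        # truncates toward zero
rounded = scaled.round().astype(int)  # rounds first, then converts

print(truncated.tolist())
print(rounded.tolist())
```

Rounding before the integer conversion is what protects against values such as 28.999999999999996 being cut down to 28.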
The challenge 12 solution seems too complex. Could we just call `.dropna()` on our newly created `ps` column?
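A sketch of the simpler approach, assuming `ps` is an ordinary numeric column with missing values. The data frame below is hypothetical, and the actual challenge may build `ps` differently.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the challenge 12 table.
df = pd.DataFrame({"country": ["at", "be", "bg"], "ps": [0.12, np.nan, 0.30]})

# Drop the rows where `ps` is missing; other columns are not checked.
complete = df.dropna(subset=["ps"])

# Called on the column itself, dropna() returns a Series without the NaN entries.
ps_only = df["ps"].dropna()

print(complete)
```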
`unemployment_rate_missing = unemployment[unemployment['unemployment_rate'].isnull()]` is enough; there is no need to subset twice.
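For illustration, a single boolean mask does the whole job. The small table and the `null_share` name are assumptions made for this sketch, not part of the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the workshop's `unemployment` table.
unemployment = pd.DataFrame({
    "country": ["at", "at", "be", "be"],
    "unemployment_rate": [5.1, np.nan, np.nan, 7.3],
})

# One boolean mask selects the rows with a missing rate.
mask = unemployment["unemployment_rate"].isnull()
unemployment_rate_missing = unemployment[mask]

# The same mask gives the share of missing values per country.
null_share = mask.groupby(unemployment["country"]).mean()

print(unemployment_rate_missing)
print(null_share)
```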
`unemployment_rate` is a bit of a weird name, as it is not actual unemployment rates but a DataFrame with null proportions for unemployment rates.
Understanding what we are looking at in the data is another useful skill, one that I think most social science / STEM grad students have, but for people...
Another thing that could be useful is using filtering functions along with groupby
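One way this could look is `GroupBy.filter`, which keeps or drops whole groups based on a condition. The data and the threshold of two observations are hypothetical, chosen only to make the sketch concrete.

```python
import numpy as np
import pandas as pd

# Hypothetical data; country codes and rates are made up.
df = pd.DataFrame({
    "country": ["at", "at", "be", "be", "bg"],
    "unemployment_rate": [5.1, 5.3, 7.0, np.nan, 9.9],
})

# Keep only the countries with at least two non-missing observations.
kept = df.groupby("country").filter(
    lambda g: g["unemployment_rate"].notna().sum() >= 2
)

print(kept)
```

Unlike an aggregation, `filter` returns the original rows of the groups that pass the condition, so the result can be used in later steps unchanged.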
Possible things to consider for intermediate/advanced pandas class:
- [ ] reading/writing to formats other than csv
- [ ] reading data in by chunks
- [ ] multi-indexing and...
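As a rough idea of what the "reading data in by chunks" item might cover, here is a sketch using the `chunksize` parameter of `pd.read_csv`. The CSV text is an in-memory, made-up example so that the block runs without any file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content; a real class would read a large file from disk.
csv_text = (
    "country,unemployment_rate\n"
    "at,5.1\nbe,7.3\nbg,9.9\nde,4.2\nfr,8.8\n"
)

total = 0.0
n_rows = 0
chunk_sizes = []

# With chunksize set, read_csv yields DataFrames of at most that many rows.
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2):
    chunk_sizes.append(len(chunk))
    total += chunk["unemployment_rate"].sum()
    n_rows += len(chunk)

# Combine the per-chunk sums into an overall mean.
mean_rate = total / n_rows
print(chunk_sizes, mean_rate)
```

Accumulating a running sum and row count keeps memory use bounded by the chunk size rather than by the size of the whole file.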