dime-python-training
feedback - session pandas May 13
When creating the first df manually, spend more time explaining that this is a manual approach you typically do not use, but that you start there to show the basics, and say that you will cover more later
Maybe start with a slide that lists the ways you can create a pandas DataFrame, and then start with the manual way
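A possible sketch for that overview slide, showing three common ways to build the same DataFrame. The `crops` data and all column names are made up for illustration, and the CSV is read from an in-memory string only so the example runs without a file:

```python
import io

import pandas as pd

# 1. Manual: a dict mapping column names to lists of values.
#    Good for showing the basics, rarely used with real data.
crops = pd.DataFrame({
    "Crop": ["Maize", "Rice", "Beans"],
    "Price": [1.5, 2.0, 3.25],
})

# 2. From a list of records (one dict per row).
records = [
    {"Crop": "Maize", "Price": 1.5},
    {"Crop": "Rice", "Price": 2.0},
    {"Crop": "Beans", "Price": 3.25},
]
crops_from_records = pd.DataFrame(records)

# 3. From a file. Here the "file" is an in-memory string;
#    in practice you would pass a path to pd.read_csv().
csv_text = "Crop,Price\nMaize,1.5\nRice,2.0\nBeans,3.25\n"
crops_from_csv = pd.read_csv(io.StringIO(csv_text))
```

All three produce the same three-row table, which may help make the point that the manual way is just one entry point among several.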
read_stata() - maybe note that not all meta information in a .dta file, such as labels, is fully supported in pandas, so one can expect the data to be read correctly, but not necessarily the metadata
from @ccsuehara in chat:
Luckily, broken hearted people had made additional libraries to freely browse a dataframe as if it was stata! https://pypi.org/project/dtale/ for example this one
Mention this, though there may not be time to cover it
about the read_stata() as well, labels are read as strings and that might be troublesome! just give a heads up
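A small sketch of what happens to labels in a round trip, which could back up the heads-up. It writes a throwaway `.dta` file to a temporary directory (the file name, columns, and label text are all hypothetical). As far as I understand the pandas behaviour, value labels come back as a categorical column with string categories by default, `convert_categoricals=False` keeps the underlying numeric codes instead, and variable labels are not attached to the DataFrame but can be read separately from the reader object:

```python
import os
import tempfile

import pandas as pd

# Hypothetical data: a categorical column becomes a value-labelled
# variable when written to Stata format.
df = pd.DataFrame({
    "crop": pd.Categorical(["maize", "rice", "maize"]),
    "price": [1.5, 2.0, 1.75],
})

path = os.path.join(tempfile.mkdtemp(), "crops.dta")
df.to_stata(path, write_index=False,
            variable_labels={"price": "Price per kg"})

# Default: value labels are converted to a pandas categorical.
labelled = pd.read_stata(path)

# Keep the numeric codes instead of the label strings.
raw = pd.read_stata(path, convert_categoricals=False)

# Variable labels live on the reader, not on the DataFrame.
with pd.read_stata(path, iterator=True) as reader:
    var_labels = reader.variable_labels()
```

Inspecting `labelled.dtypes`, `raw.dtypes`, and `var_labels` during the session would show participants where each piece of Stata metadata ends up.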
On the column label / column name slide, you tripped a little on the points you wanted to make. I think it was fine in the end, but there was room to add one more bullet point so maybe add one more point there
slide 30 - missing iloc in example
The .loc indexer also selects columns by their labels, for example crops.loc[:, ['Price']]; basically we can subset anything with .loc
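A minimal sketch that could fill the gap on the slide, putting `.loc` (labels) next to `.iloc` (integer positions). The `crops` DataFrame here is a made-up stand-in for the one used in the session:

```python
import pandas as pd

# Hypothetical stand-in for the session's crops DataFrame.
crops = pd.DataFrame(
    {"Price": [1.5, 2.0, 3.25], "Quantity": [10, 4, 7]},
    index=["Maize", "Rice", "Beans"],
)

# .loc selects by label: all rows, the column labelled "Price".
by_label = crops.loc[:, ["Price"]]

# .iloc selects by integer position: all rows, the first column.
by_position = crops.iloc[:, [0]]

# .loc can subset rows and columns at the same time.
cell = crops.loc["Rice", "Price"]
rows_and_cols = crops.loc[["Maize", "Beans"], ["Quantity"]]
```

Here `by_label` and `by_position` are the same one-column DataFrame, which makes the label versus position distinction easy to show side by side.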
Maybe clarify that each condition needs () when combining more than one, because of operator precedence: & and | bind more tightly than comparisons such as > or ==. Will expand more on this
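A short sketch of the precedence point, using made-up data. Without parentheses Python groups `1.75 & crops["Quantity"]` first, so the expression no longer means what it looks like and fails; the exact exception type may depend on the data and pandas version, so the example catches both likely ones:

```python
import pandas as pd

# Hypothetical data for illustration.
crops = pd.DataFrame(
    {"Price": [1.5, 2.0, 3.25], "Quantity": [10, 4, 7]},
    index=["Maize", "Rice", "Beans"],
)

# Correct: each comparison is wrapped in parentheses.
both = crops[(crops["Price"] > 1.75) & (crops["Quantity"] > 5)]

# Without parentheses, & is evaluated before the comparisons:
#   crops["Price"] > (1.75 & crops["Quantity"]) > 5
try:
    crops[crops["Price"] > 1.75 & crops["Quantity"] > 5]
    failed = False
except (TypeError, ValueError):
    failed = True
```

Only the Beans row satisfies both conditions, and the unparenthesized version raises instead of returning a filtered table.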
The most similar thing I've found for % and not repeating the df name is the .eval command https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.eval.html
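A minimal sketch of the idea, using `DataFrame.eval` and `DataFrame.query`, the method counterparts of the linked `pandas.eval`. Column names inside the expression string are resolved against the DataFrame, so its name does not have to be repeated. The data and the `Revenue` column are hypothetical:

```python
import pandas as pd

# Hypothetical data for illustration.
crops = pd.DataFrame({"Price": [1.5, 2.0, 3.25], "Quantity": [10, 4, 7]})

# Usual way: the DataFrame name is repeated for every column.
usual = crops["Price"] * crops["Quantity"]

# With eval: columns are referenced by name inside the string.
# By default this returns a new DataFrame with the added column.
with_revenue = crops.eval("Revenue = Price * Quantity")

# query() applies the same idea to row filtering.
expensive = crops.query("Price > 1.75 and Quantity > 5")
```

Inside `query()` the word `and` can be used between conditions, which also sidesteps the parentheses issue that comes up with `&`.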
We can also have the index as a variable with df['ind'] = df.index
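A short sketch of that, with made-up data. It also shows `reset_index()` as an alternative that moves the index into a regular column and replaces it with the default 0, 1, 2, ... index; the `Crop` axis name is an assumption chosen for the example:

```python
import pandas as pd

# Hypothetical data with crop names as the index.
crops = pd.DataFrame(
    {"Price": [1.5, 2.0, 3.25]},
    index=["Maize", "Rice", "Beans"],
)

# Copy the index into a regular column; the index itself stays.
with_ind = crops.copy()
with_ind["ind"] = with_ind.index

# Alternative: name the index, then move it into a column.
flat = crops.rename_axis("Crop").reset_index()
```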