introduction-datascience-python-book
Introduction to Data Science: A Python Approach to Concepts, Techniques and Applications
The first table in Chapter 2, section "Manipulating data", claims that pandas std() is an unbiased standard deviation. That is not correct: std() is a sample...
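To make the point about std() concrete, here is a minimal sketch comparing the pandas default (`ddof=1`, which divides by n - 1) with the NumPy default (`ddof=0`, which divides by n). The sample values are made up for illustration and do not come from the book.

```python
import math

import numpy as np
import pandas as pd

# Hypothetical sample data, chosen so the population standard deviation is exactly 2.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
s = pd.Series(values)

# pandas defaults to ddof=1: the sum of squared deviations is divided by n - 1.
sample_std = s.std()

# Passing ddof=0 divides by n instead, which matches NumPy's default.
population_std = s.std(ddof=0)
numpy_std = np.std(values)

print(f"pandas std() (ddof=1):    {sample_std:.4f}")
print(f"pandas std(ddof=0):       {population_std:.4f}")
print(f"numpy std() (ddof=0):     {numpy_std:.4f}")
```

The two libraries disagree on the same data unless `ddof` is set explicitly, so it is worth stating which convention a table or notebook cell uses.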
It appears that recent updates to pandas, as well as to other libraries, have caused some of the notebooks to break.
As per the documentation, metrics.r2_score expects two parameters: the first should be the true values 'y' and the second should be the predicted values 'y_hat'.
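The argument order matters because R² is not symmetric in its inputs: the total sum of squares is computed from the first argument. A small sketch with made-up values (not taken from the book's notebooks) shows the score changing when the arguments are swapped:

```python
from sklearn.metrics import r2_score

# Hypothetical true and predicted values, for illustration only.
y = [3.0, -0.5, 2.0, 7.0]
y_hat = [2.5, 0.0, 2.0, 8.0]

# Correct order: true values first, predictions second.
correct = r2_score(y, y_hat)

# Swapped order: the predictions are treated as the ground truth,
# so the total sum of squares is taken around the wrong mean.
swapped = r2_score(y_hat, y)

print(f"r2_score(y, y_hat) = {correct:.4f}")
print(f"r2_score(y_hat, y) = {swapped:.4f}")
```

Passing the arguments by keyword, `r2_score(y_true=y, y_pred=y_hat)`, is one way to avoid the mix-up.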
After the ice data set is cleaned, the figure drawn for the year-extent relation should be labeled year-extent instead of month-extent.
Section 3.3.3, on the treatment of outliers, suggests that we can clean up values that exceed the median by 2 or 3 standard deviations: ```python df2 = df.drop( df.index[(df.income =='>50K\n')...