isds2020
Ex. 3.2.1
When we make a box plot of the probability of survival for men and women within each passenger class, it does not turn out well. There is no ‘second class’ in the result, and the plot is not informative. Our code is:
sns.boxplot(x='class', y='survived', hue='sex', data=titanic, ax=ax[1])
Hi @theaiuel,
The reason your plot turns out weird is that the survived variable is a dummy variable: it only takes the values 0 and 1.
As you can see in the screenshot below, the descriptive statistics for the survived variable conditioned on class and sex are not that useful. The boxplot basically visualizes these measures (the quartiles), so the plot is not informative either.
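To make this concrete, here is a minimal sketch using only the Python standard library. The three groups and their 0/1 values are made up for illustration and are not the real Titanic numbers. It computes the quartiles a box plot would draw next to the mean of each group.

```python
from statistics import mean, quantiles

# Made-up 0/1 outcomes for three hypothetical groups (NOT the real Titanic data).
groups = {
    "group_a": [0, 0, 0, 0, 0, 0, 1, 1, 1],
    "group_b": [0, 0, 0, 0, 0, 1, 1, 1, 1],
    "group_c": [0, 0, 0, 0, 0, 0, 0, 0, 1],
}


def summarize(values):
    """Return the quartiles a box plot is built from, plus the mean."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {"q1": q1, "median": median, "q3": q3, "mean": mean(values)}


summaries = {name: summarize(values) for name, values in groups.items()}

for name, summary in summaries.items():
    print(name, summary)
```

In this toy data, group_a and group_b share the same quartiles even though their survival shares differ, so their boxes would look identical. In group_c all three quartiles coincide, so the box collapses to a flat line. A collapse like that may be why a class seems to be missing from the box plot.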
The barplot is more useful in this case.
"A bar plot represents an estimate of central tendency for a numeric variable with the height of each rectangle and provides some indication of the uncertainty around that estimate using error bars." https://seaborn.pydata.org/generated/seaborn.barplot.html
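The mean of a dummy variable is simply the share of observations equal to 1, here the share of survivors in each group. As that is quite informative, the barplot is the plot type to use here :)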
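A minimal sketch of the barplot version could look like the following. The small `titanic` DataFrame here is a made-up stand-in so the example runs on its own; in the exercise you would keep your own `titanic` data and `ax`. The only change from the original call is `sns.boxplot` becoming `sns.barplot`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up stand-in for the exercise data (NOT the real Titanic numbers).
titanic = pd.DataFrame({
    "class": ["First"] * 4 + ["Second"] * 4 + ["Third"] * 4,
    "sex": ["male", "male", "female", "female"] * 3,
    "survived": [1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0],
})

fig, ax = plt.subplots(1, 2, figsize=(10, 4))

# Same arguments as the boxplot call, but the bar height is the group mean,
# i.e. the share of survivors in each class/sex combination.
sns.barplot(x="class", y="survived", hue="sex", data=titanic, ax=ax[1])

# The numbers behind the bar heights, computed directly with pandas.
group_means = titanic.groupby(["class", "sex"])["survived"].mean()
print(group_means)
```

The `groupby` at the end is only there to check the bar heights against the group means.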
