funpymodeling
Create `freq_plot` function (frequency plot for categorical variables)
In funModeling, the `freq` function plots the frequency of every categorical variable.
The code below does something similar:

    import seaborn as sns
    import matplotlib.pyplot as plt
    from funpymodeling.exploratory import cat_vars  # assumed import path for cat_vars

    sns.set()
    tips = sns.load_dataset("tips")
    d_plot = tips

    # One horizontal count plot per categorical variable, bars ordered by frequency
    fig, ax = plt.subplots(4, 2, figsize=(20, 20))
    for variable, subplot in zip(cat_vars(d_plot), ax.flatten()):
        sns.countplot(y=d_plot[variable], ax=subplot, order=d_plot[variable].value_counts().index)
        for label in subplot.get_xticklabels():
            label.set_rotation(90)

It shows a grid with one count plot per categorical variable. Points to address in `freq_plot`:
- Long category names overlap across neighboring plots (not the case with this dataset, but it can happen).
- Don't create empty grids (calculate the number of plots dynamically).
- Each bar needs to show its absolute frequency and its percentage. This data is already calculated by the `freq_tbl` function in this package.
- If there are more than 100 different categories, the plot should group the excess ones into an `other` or `more` category, to avoid crashing.
- It should use the `todf()` function (from funpymodeling) to convert different data types to a DataFrame, so `freq_plot` supports 1D/2D NumPy arrays, pandas Series, and 1D/2D lists.
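
As a rough illustration of the bar-label and `other` grouping points above, here is a minimal sketch using plain pandas. The helper names `lump_categories` and `bar_labels`, their parameters, and the column names are hypothetical and are not part of funpymodeling. In the real function the counts and percentages would presumably come from `freq_tbl`, and the input conversion from `todf()`.

```python
import pandas as pd


def lump_categories(values, max_categories=100, other_label="other"):
    """Keep the most frequent categories and merge the rest into one label.

    Hypothetical helper: `max_categories` and `other_label` are assumptions.
    """
    # Cast to object so a new label can be assigned even to categorical input
    s = pd.Series(values).astype("object")
    counts = s.value_counts()  # sorted from most to least frequent
    if len(counts) <= max_categories:
        return s
    # Reserve one slot for the merged label
    keep = counts.index[: max_categories - 1]
    # Missing values are left untouched; everything else not kept is merged
    return s.where(s.isin(keep) | s.isna(), other_label)


def bar_labels(values):
    """Return the count, percentage and a text label for each category."""
    counts = pd.Series(values).value_counts()
    pct = counts / counts.sum() * 100
    table = pd.DataFrame({"frequency": counts, "percentage": pct.round(2)})
    table["label"] = [
        f"{n} ({p:.2f}%)" for n, p in zip(table["frequency"], table["percentage"])
    ]
    return table
```

The `label` column could then be drawn next to each bar, for example with matplotlib's `Axes.text`, after the data has been passed through `lump_categories`.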
Can we use plotly? :P
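
For the empty-grid point listed above, one possible approach is to derive the grid shape from the number of variables instead of hard-coding 4 x 2. This is only a sketch; `grid_shape` and `unused_slots` are hypothetical helper names, and the default of two columns is an assumption taken from the example code.

```python
import math


def grid_shape(n_plots, ncols=2):
    """Return (nrows, ncols) for the smallest grid that holds n_plots."""
    if n_plots < 1:
        raise ValueError("n_plots must be at least 1")
    # Never use more columns than there are plots
    ncols = min(ncols, n_plots)
    nrows = math.ceil(n_plots / ncols)
    return nrows, ncols


def unused_slots(n_plots, ncols=2):
    """Number of grid cells left empty (only the last row can be partial)."""
    nrows, ncols = grid_shape(n_plots, ncols)
    return nrows * ncols - n_plots
```

The resulting shape can be passed to `plt.subplots(nrows, ncols)`. When the variable count does not fill the last row, the leftover axes can be removed with `fig.delaxes(ax)`.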