ENH: Allow plotting weighted KDEs
Feature Type
- [X] Adding new functionality to pandas
- [ ] Changing existing functionality in pandas
- [ ] Removing existing functionality in pandas
Problem Description
pandas does not currently support plotting weighted KDEs.
Feature Description
The PDF is currently estimated via scipy.stats.gaussian_kde, which accepts a weights parameter.
pandas.DataFrame.plot.kde should accept this parameter as well.
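For reference, here is a minimal sketch of what the weights parameter does when scipy.stats.gaussian_kde is called directly. The sample values and weights are the ones used later in this thread. The evaluation grid and the variable names are only illustrative assumptions.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Sample values and weights taken from the examples in this thread.
data = np.array([1, 2, 2.5, 3, 3.5, 4, 5])
weights = np.array([0.1, 0.0, 0.0, 0.2, 0.3, 0.4, 0.9])

# Unweighted vs. weighted estimators; gaussian_kde normalizes the weights itself.
kde_unweighted = gaussian_kde(data)
kde_weighted = gaussian_kde(data, weights=weights)

# Illustrative evaluation grid (an assumption, not what pandas uses internally).
grid = np.linspace(-5, 11, 801)
pdf_unweighted = kde_unweighted(grid)
pdf_weighted = kde_weighted(grid)

# The weighted density is pulled toward the heavily weighted observations.
mode_unweighted = grid[np.argmax(pdf_unweighted)]
mode_weighted = grid[np.argmax(pdf_weighted)]
print(mode_unweighted, mode_weighted)
```

With these weights, most of the mass sits on the larger observations, so the peak of the weighted curve moves to the right of the unweighted one.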
Alternative Solutions
Allow a weights parameter to be passed through to scipy.stats.gaussian_kde.
Additional Context
No response
Hello, I am working on it.
I updated the following:
https://github.com/fbourgey/pandas/blob/feature-plot-weighted-kde/pandas/plotting/_core.py#L1449
https://github.com/fbourgey/pandas/blob/feature-plot-weighted-kde/pandas/plotting/_matplotlib/hist.py#L266
The code works.
Should we add one example in the function kde with the parameter weights?
Does this function need to be updated as well?
The following code gives
s = pd.Series([1, 2, 2.5, 3, 3.5, 4, 5])
ax = s.plot.kde()
Replacing with some weights produces
weights = pd.Series([0.1, 0.0, 0.0, 0.2, 0.3, 0.4, 0.9])
ax = s.plot.kde(weights=weights)
Using a NumPy array works as well
weights = np.array([0.1, 0.4, 0.0, 0.2, 0.3, 0.4, 0.2])
However, passing a list instead
weights = [0.1, 0.4, 0.0, 0.2, 0.3, 0.4, 0.2]
raises the following error
  File "/Users/florianbourgey/projects/misc/pandas_gaussian_kde.py", line 7, in <module>
    ax = s.plot.kde(weights=weights)
  File "/Users/florianbourgey/projects/pandas/pandas/plotting/_core.py", line 1567, in kde
    return self(kind="kde", bw_method=bw_method, weights=weights, ind=ind, **kwargs)
  File "/Users/florianbourgey/projects/pandas/pandas/plotting/_core.py", line 1049, in __call__
    return plot_backend.plot(data, kind=kind, **kwargs)
  File "/Users/florianbourgey/projects/pandas/pandas/plotting/_matplotlib/__init__.py", line 71, in plot
    plot_obj.generate()
  File "/Users/florianbourgey/projects/pandas/pandas/plotting/_matplotlib/core.py", line 500, in generate
    self._make_plot(fig)
  File "/Users/florianbourgey/projects/pandas/pandas/plotting/_matplotlib/hist.py", line 168, in _make_plot
    kwds["weights"] = type(self)._get_column_weights(self.weights, i, y)
  File "/Users/florianbourgey/projects/pandas/pandas/plotting/_matplotlib/hist.py", line 202, in _get_column_weights
    weights = weights[~isna(y)]
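The last frame indexes weights with a boolean mask, which a plain Python list does not support. The traceback above is cut off before the exception message, so the snippet below only reproduces the pattern in isolation. It is a sketch, not the actual pandas code: np.isnan stands in for isna, the y values are hypothetical, and coercing with np.asarray is just one possible fix.

```python
import numpy as np

# Hypothetical column values with one missing entry (not from the thread).
y = np.array([1, 2, np.nan, 3, 3.5, 4, 5])
weights = [0.1, 0.4, 0.0, 0.2, 0.3, 0.4, 0.2]  # list, as in the failing call

# Boolean mask of non-missing values; np.isnan is a stand-in for pandas' isna.
mask = ~np.isnan(y)

# A list cannot be indexed with a boolean array.
try:
    weights[mask]
    list_indexing_failed = False
except TypeError:
    list_indexing_failed = True

# Possible fix (an assumption): coerce to an ndarray before masking.
weights_arr = np.asarray(weights, dtype=float)
filtered = weights_arr[mask]
print(list_indexing_failed, filtered)
```

Coercing once at the point where weights enters the plotting code would let lists, NumPy arrays and Series all take the same path.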