specifying bins argument of sns.histplot as bin edges of a datetime type
seaborn version : '0.11.0'
I can produce a histogram of dates using bins=number of bins with no problem:
sns.histplot(data=df['visit_date'], bins=20)
I cannot seem to specify the bin edges as a date type:
sns.histplot(data=df['visit_date'], bins=np.arange("2000", "2020", dtype="datetime64[D]"))
In [57]: sns.histplot(data=df['visit_date'],bins= np.arange("2000", "2020", dtype="datetime64[D]"))
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-57-e11b4de76ee6> in <module>
----> 1 sns.histplot(data=df['visit_date'],bins= np.arange("2000", "2020", dtype="datetime64[D]"))
/data2/yelena/miniconda3/lib/python3.7/site-packages/seaborn/distributions.py in histplot(data, x, y, hue, weights, stat, bins, binwidth, binrange, discrete, cumulative, common_bins, common_norm, multiple, element, fill, shrink, kde, kde_kws, line_kws, thresh, pthresh, pmax, cbar, cbar_ax, cbar_kws, palette, hue_order, hue_norm, color, log_scale, legend, ax, **kwargs)
1433 estimate_kws=estimate_kws,
1434 line_kws=line_kws,
-> 1435 **kwargs,
1436 )
1437
/data2/yelena/miniconda3/lib/python3.7/site-packages/seaborn/distributions.py in plot_univariate_histogram(self, multiple, element, fill, common_norm, common_bins, shrink, kde, kde_kws, color, legend, line_kws, estimate_kws, **plot_kws)
434
435 # Do the histogram computation
--> 436 heights, edges = estimator(observations, weights=weights)
437
438 # Rescale the smoothed curve to match the histogram
/data2/yelena/miniconda3/lib/python3.7/site-packages/seaborn/_statistics.py in __call__(self, x1, x2, weights)
369 """Count the occurrances in each bin, maybe normalize."""
370 if x2 is None:
--> 371 return self._eval_univariate(x1, weights)
372 else:
373 return self._eval_bivariate(x1, x2, weights)
/data2/yelena/miniconda3/lib/python3.7/site-packages/seaborn/_statistics.py in _eval_univariate(self, x, weights)
350 density = self.stat == "density"
351 hist, _ = np.histogram(
--> 352 x, bin_edges, weights=weights, density=density,
353 )
354
<__array_function__ internals> in histogram(*args, **kwargs)
/data2/yelena/miniconda3/lib/python3.7/site-packages/numpy/lib/histograms.py in histogram(a, bins, range, normed, weights, density)
876 for i in _range(0, len(a), BLOCK):
877 sa = np.sort(a[i:i+BLOCK])
--> 878 cum_n += _search_sorted_inclusive(sa, bin_edges)
879 else:
880 zero = np.zeros(1, dtype=ntype)
/data2/yelena/miniconda3/lib/python3.7/site-packages/numpy/lib/histograms.py in _search_sorted_inclusive(a, v)
459 """
460 return np.concatenate((
--> 461 a.searchsorted(v[:-1], 'left'),
462 a.searchsorted(v[-1:], 'right')
463 ))
TypeError: invalid type promotion
Please turn this into a reproducible sample (a simple fake dataset should suffice), thanks.
Reproducible snippet:
dffake=pd.DataFrame(pd.date_range(start = '2012-01-01',end = '2019-01-01',freq='D'),columns=['date']).sample(10)
bins = pd.date_range(start = '2012-01-01',end = '2019-01-01',freq='7D')
The following all fail:
sns.histplot(data=dffake.date,bins=bins)
sns.histplot(data=dffake.date,bins=bins.astype('datetime64[ns]'))
sns.histplot(data=dffake.date,bins=np.array(bins.astype('datetime64[ns]')))
sns.histplot(data=dffake.date,bins=bins.to_pydatetime())
Thanks!
This happens because, by the time the histogram is computed, the datetime data have already been converted to numeric values, while bins is passed straight through to numpy. You end up with numeric data and datetime bin edges, which numpy cannot compare (hence the "invalid type promotion" error).
In principle, this is not difficult to solve, but doing so will be annoying because bins is a very flexible argument, and most specifications (e.g. a number or a string) should not be converted at all.
BTW, I imagine that we'll run into the same problem with binwidth and binrange.
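To illustrate the mismatch described above, here is a minimal sketch that uses only numpy, pandas, and matplotlib (no seaborn). It assumes the numeric representation is the one produced by matplotlib.dates.date2num, which is what the workaround in this thread relies on; the variable names are made up for illustration.

```python
import matplotlib.dates as mdates
import numpy as np
import pandas as pd

# Daily observations and weekly bin edges, mirroring the fake data in this thread.
dates = pd.date_range(start="2012-01-01", end="2019-01-01", freq="D")
edges = pd.date_range(start="2012-01-01", end="2019-01-01", freq="7D")

# Assumption: the plotting layer works on matplotlib date numbers (floats),
# so the bin edges have to be expressed in the same units.
x_num = mdates.date2num(dates.to_numpy())
edges_num = mdates.date2num(edges.to_numpy())

# With data and edges both numeric, np.histogram has nothing to promote.
counts, _ = np.histogram(x_num, bins=edges_num)
print(len(edges_num), len(counts), counts.sum())
```

Passing the original datetime edges together with the numeric data is the combination that produces the error in the traceback.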
Fortunately, it's easy to work around in user space by converting the bin edges to matplotlib date numbers:
sns.histplot(data=dffake.date, bins=mpl.dates.date2num(bins))
Ah, thank you for the explanation and the easy workaround. The solution has been difficult to track down. Thanks!
While the workaround here isn't extremely obvious, I think it's pretty simple once you know what to do, and it looks like supporting bins-with-original-units would be rather complex. So I think I'm going to close with no action for now.