Support for datetime64
Hi @jbednar thank you for writing such a useful library... It has so much potential...
Are there any plans to support numpy's datetime64 datatype, or pandas DatetimeIndexes more generally? When I add the following lines to examples/tseries.ipynb
from datetime import datetime
startdt, enddt = [datetime.fromtimestamp(d) for d in (start, end)]
ts = pd.date_range(startdt, periods=n, freq=(enddt - startdt) / n)
df['ts'] = ts.values
and then choose to plot 'ts' on the x-axis:
cvs = ds.Canvas(x_range=x_range, y_range=y_range, plot_height=300, plot_width=900)
aggs = OrderedDict((c, cvs.line(df, 'ts', c)) for c in cols)
img = tf.shade(aggs['a'])
I get the following error:
/home/nick.tomlinson/anaconda2/lib/python2.7/site-packages/datashader-0.4.0-py2.7.egg/datashader/glyphs.pyc in validate(self, in_dshape)
     25     def validate(self, in_dshape):
     26         if not isreal(in_dshape.measure[self.x]):
---> 27             raise ValueError('x must be real')
     28         elif not isreal(in_dshape.measure[self.y]):
     29             raise ValueError('y must be real')

ValueError: x must be real
'ts' in this case is a datetime64:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100000 entries, 0 to 99999
Data columns (total 12 columns):
Time    100000 non-null float64
a       100000 non-null float64
b       100000 non-null float64
c       100000 non-null float64
d       100000 non-null float64
e       100000 non-null float64
f       100000 non-null float64
g       100000 non-null float64
x       100000 non-null float64
y       100000 non-null float64
z       100000 non-null float64
ts      100000 non-null datetime64[ns]
dtypes: datetime64[ns](1), float64(11)
memory usage: 9.2 MB
Many thanks, Nick
We would love to add support for datetime64, which has already been discussed in a separate issue (#218). However, we don't currently have any manpower to try to implement that. If someone is interested in working on it, I'd be happy to review a PR. Meanwhile, you can use the approach in https://anaconda.org/jbednar/holoviews_datashader of converting to int64 and then reinterpreting it for display, which is awkward but works ok.
I think there is a fairly straightforward solution to this (although I can't speak to how easy it will be to implement based on the current organization of the code). We could simply convert datetime types to int64 on the input and then convert the coordinates on the DataArray and ds.Image output back to a datetime type. That way the numba code never has to deal with datetime types, and the cost of converting types should be pretty small.
Converting datetimes in general is a rabbit hole though, so perhaps to begin with we could just focus on datetime64[ns] types because we can just rely on numpy casting:
import numpy as np
import pandas as pd
drange = np.array(pd.date_range(start="2014-01-01", end="2016-01-01", freq='1min'))
drange.astype('int64').astype('datetime64[ns]')
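To make the proposal concrete, here is a minimal sketch of the round trip using numpy only. It does not call datashader; the hourly binning with `np.bincount` is just a stand-in for whatever aggregation the numba code would do on integers, and the names `ts_int`, `bin_width`, and `bin_starts` are made up for illustration.

```python
import numpy as np

# One day of one-minute timestamps, stored as datetime64[ns].
ts = np.arange("2014-01-01", "2014-01-02", dtype="datetime64[m]").astype("datetime64[ns]")

# Input side: reinterpret as int64 nanoseconds since the epoch, so the
# aggregation step only ever sees plain integers.
ts_int = ts.astype("int64")

# Stand-in for the aggregation (an assumption, not datashader's code):
# count points per one-hour bin using integer arithmetic only.
bin_width = np.int64(3600) * 1_000_000_000  # one hour in nanoseconds
bin_index = (ts_int - ts_int.min()) // bin_width
counts = np.bincount(bin_index, minlength=24)

# Output side: cast the integer bin coordinates back to datetime64[ns]
# so the axis can be labelled with real timestamps.
bin_starts = (ts_int.min() + np.arange(24, dtype="int64") * bin_width).astype("datetime64[ns]")

# The int64 round trip itself loses nothing.
restored = ts_int.astype("datetime64[ns]")
```

One caveat with this approach: if the integer coordinates pass through float64 on the way (for example as an axis range), values of this magnitude can no longer resolve individual nanoseconds, so the labels may be slightly rounded.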
Sounds like a promising approach to me!
hey guys, any update on this front?