plotly.py
Plot DataFrame with TimedeltaIndex (x-axis)
Hello,
Plotly.py doesn't display x axis well when using a TimedeltaIndex
import pandas as pd
from io import StringIO
import plotly
from plotly.graph_objs import Scatter, Layout
dat = """millis;ax;ay;az
544;-5.65;-0.39;7.45
550;-4.79;-0.59;7.26
556;-4.79;-1.33;6.79
562;-0.63;-1.33;9.53
579;-4.63;0.16;7.96
599;-5.45;0.35;7.34
618;-5.18;1.88;3.77
637;-6.12;-2.00;9.92
658;-3.80;0.51;9.02
677;-4.35;0.04;9.53
697;-3.88;0.71;8.79
717;-4.86;-0.43;8.83
741;-4.16;-1.06;8.79
756;-3.57;0.31;7.92
777;-2.71;2.79;8.32
796;-2.43;5.53;10.55
816;-2.75;3.30;8.67
835;-2.12;2.47;7.85
856;-2.04;2.63;7.85
875;-2.31;2.31;8.04
894;-2.00;3.37;8.12
922;0.86;7.69;9.65
942;-1.45;5.26;8.75
961;-1.96;4.35;8.04
985;-1.80;3.77;8.36
1001;-1.61;3.10;8.55"""
df = pd.read_csv(StringIO(dat), sep=';')
df['millis'] = pd.to_timedelta(df['millis'], unit='ms')
df = df.set_index('millis')
df.index = df.index + pd.Timestamp("1970/01/01")
print(df)
print(df.dtypes)
print(df.index)
plotly.offline.plot({
    "data": [
        Scatter(x=df.index, y=df['ax'])
    ],
    "layout": Layout(
        title="DataFrame with %s" % str(type(df.index))
    )
})
When using a TimedeltaIndex, the x-axis has odd values starting with P0D 0H0M0.

When using a DatetimeIndex, the x-axis is labeled correctly.

It would be great if Plotly.py could handle a TimedeltaIndex without hassle.
Kind regards
+1
Does anyone have any suggestions on how the timedelta values should be labeled on the axes? Or some examples of how other plotting libraries handle this?
It could be hh:mm:ss.000000000, where hh is hours, mm is minutes, ss is seconds, and the nine zeros are nanoseconds. The underlying representation of timedelta64 is an integer.
Timedelta values could be labeled on the axes in the same manner as Timestamp values (which are labeled correctly), but without the date.
On my side I only need millisecond resolution, but I understand that others may need microsecond or nanosecond resolution.
Maybe this Pandas issue about formatting Timedelta to string https://github.com/pandas-dev/pandas/issues/26897 should also be considered.
Below 24h, Timedelta could probably be written using n days hh:mm:ss.xxx
Also be aware that the way negative Timedelta values are currently formatted by Pandas (and Python) is not really human-readable: https://github.com/pandas-dev/pandas/issues/17232
When the event resolution is only milliseconds, we probably don't need to display Timestamp values with microsecond or nanosecond resolution.
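To make the labeling suggestions above concrete, here is a minimal sketch of such a formatter. It assumes the value arrives as an integer nanosecond count (the timedelta64 representation mentioned earlier); `format_timedelta_ns` and its `resolution` parameter are hypothetical names, not part of pandas or plotly. A day count is prepended only when it is non-zero, and negative values get a plain leading minus sign instead of the pandas style.

```python
def format_timedelta_ns(total_ns, resolution="ms"):
    """Format an integer nanosecond count as '[-][N days ]hh:mm:ss[.fraction]'.

    Hypothetical helper for illustration only.
    """
    # Number of fractional digits to keep for each supported resolution.
    digits = {"s": 0, "ms": 3, "us": 6, "ns": 9}[resolution]

    sign = "-" if total_ns < 0 else ""
    total_ns = abs(total_ns)

    seconds, frac_ns = divmod(total_ns, 1_000_000_000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)

    text = f"{hours:02d}:{minutes:02d}:{seconds:02d}"
    if digits:
        # Truncate (do not round) the fraction to the requested resolution.
        text += "." + f"{frac_ns:09d}"[:digits]
    if days:
        text = f"{days} days {text}"
    return sign + text


print(format_timedelta_ns(544_000_000))             # first sample of the data above
print(format_timedelta_ns(90_061_001_000_000))      # a little more than one day
print(format_timedelta_ns(-1_500_000_000, "us"))    # negative value
```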
Any update on this? I'm using plotly express to look at histograms of TimeDelta objects and the X-axis labeling is basically useless. I have to manually convert to hours, minutes, etc.
If your organization has a software budget and needs this feature, you can prioritize & sponsor the development of it by reaching out to our team: https://plot.ly/products/consulting-and-oem/. Much of our development is funded this way.
Otherwise, we'll update this issue when it's planned for an upcoming release. There are no updates at this moment and it is still a good idea.
I've tried to fix this in different ways within my scratch plotly.py space but it always comes down to how Timedelta wants to be treated in plotly.js as a numeric entity and not as a weird date or string value.
What happens now with a Timedelta series or index is that the underlying data is written out as raw numerical values and the time unit information is lost. The values just show up in plotly.js as numbers. With extra work, you can cause the timedeltas to be printed out as strings, but then you lose the niceness of a numerical graph (interpolation, scaling, etc).
What would be really nice is to let timedeltas be written out numerically, much as they already are, and passed around as they already are, while plotly.js provides a Timedelta-specific print format specifier. That way axis labels and hovers would print in a timedelta ISO format while still being expressed numerically under the covers.
I'm suggesting all that's needed is a Timedelta specific print format specifier. Maybe something that uses %t and %T where the underlying units can be either assumed or tacked on as a modifier, "%t!s" as an idea. More format expression would be nice if possible, the more like date formatting the better if possible, but these are just quick ideas for syntax examples.
There's just no way to use Timedelta as an axis cleanly without something like this. It either gets turned into weird datetimes or you have to turn them all into strings.
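Until something like that exists, one way to approximate the idea (numeric values underneath, timedelta-style labels on top) is to supply the tick positions and labels explicitly. This is only a sketch, assuming the x values are plain non-negative millisecond numbers; `timedelta_ticks` is a hypothetical helper, while `tickmode`, `tickvals`, and `ticktext` are existing axis attributes.

```python
def timedelta_ticks(start_ms, stop_ms, step_ms):
    """Return (tickvals, ticktext) for a numeric axis holding milliseconds.

    Hypothetical helper; assumes non-negative integer inputs.
    """
    tickvals = list(range(start_ms, stop_ms + 1, step_ms))
    ticktext = []
    for ms in tickvals:
        seconds, millis = divmod(ms, 1000)
        minutes, seconds = divmod(seconds, 60)
        hours, minutes = divmod(minutes, 60)
        ticktext.append(f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}")
    return tickvals, ticktext


# Ticks every 100 ms across the range of the sample data.
tickvals, ticktext = timedelta_ticks(500, 1000, 100)

# This dict would be passed as the layout's xaxis, e.g. Layout(xaxis=xaxis).
xaxis = dict(tickmode="array", tickvals=tickvals, ticktext=ticktext)
print(xaxis)
```

The labels are fixed up front, so they do not adapt when zooming, which is exactly what a real format specifier would solve.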
If you only want the time and no days, you can still hack timedelta somewhat by turning them all into actual datetimes:
df.index = df.index + pd.Timestamp("1970/01/01")
and then using a tick format specifier to print only the time portion:
"layout": Layout(
    title="DataFrame with %s" % str(type(df.index)),
    xaxis_tickformat='%X.%LS',
)
Having said all of this, and already wished for more than just iso format, something even more flexible like embedded js or some client side callback might find even more uses: %@func_name
But that's pie in the sky dreaming.
For reference, there are two other issues closely related to this one, covering the y axis (#801), and color axis (#3368). It would probably make sense to tackle all three issues holistically, rather than separately.
I'd like to see this, even if plotly express just handles the conversion from timedelta to datetime and then the formatting of the x-axis ticks.
Hi, any update on this issue?
I'm also looking forward to this issue being resolved. I want to plot a gantt-chart-like bar plot where each bar on the x-axis goes from one time to another, so in a bar plot its length is given by a timedelta. px.timeline can handle that somehow, but it defines the start and end points instead of start and duration, and it is much more limited.
edit: I have compared the html generated by px.timeline, which handles the dates correctly. I managed to do a workaround with 2 simple modifications on my barplot:
- adding a type:"date" to the xaxis: fig['layout']['xaxis'].update(dict( type="date" ))
- dividing the timedelta by 1 million (seems that the time units are the problem here): x=(df['Finish']-df['Start'])/1000000
Now I have a decent range and tick labels on the x axis
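The division by one million presumably works because pandas stores timedeltas as nanoseconds, while a plotly date axis reads plain numbers as milliseconds. A small sketch of the same conversion using only the standard library; `duration_to_axis_ms` is a hypothetical helper and the start and finish times are made up for illustration.

```python
from datetime import datetime, timedelta


def duration_to_axis_ms(duration):
    """Convert a timedelta to a millisecond count (hypothetical helper)."""
    return duration / timedelta(milliseconds=1)


# Made-up task boundaries, standing in for df['Start'] and df['Finish'].
start = datetime(2021, 1, 1, 8, 0)
finish = datetime(2021, 1, 1, 9, 30)

# Bar length in milliseconds: 1.5 hours.
bar_length = duration_to_axis_ms(finish - start)

# The same duration as a raw nanosecond count, divided by one million
# as in the workaround above, gives the same number.
duration_ns = 5_400_000_000_000
assert duration_ns / 1_000_000 == bar_length

# Axis setting from the first bullet of the workaround.
xaxis_update = dict(type="date")
print(bar_length, xaxis_update)
```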
I also have this issue. I have some timeseries that are a few years long, but I would like the timeaxis as time since a specific date. It would also enable me to plot several timeseries together to compare, while these have different start dates in calendar time. The tick labels should be days or 10s of days or 100s of days. When I zoom in, it could switch to days+hh:mm:ss.xxxx.
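For comparing series with different start dates, the manual route today is to convert each series to elapsed days before plotting. A minimal sketch, assuming the timestamps are `datetime` objects; `days_since` is a hypothetical helper and the dates are made up.

```python
from datetime import datetime


def days_since(timestamps, reference):
    """Return elapsed time since `reference` as fractional days (hypothetical helper)."""
    return [(t - reference).total_seconds() / 86400 for t in timestamps]


# Two made-up series that start in different calendar years.
series_a = [datetime(2018, 3, 1), datetime(2018, 3, 11), datetime(2018, 6, 9)]
series_b = [datetime(2020, 7, 1), datetime(2020, 7, 1, 12), datetime(2020, 7, 21)]

# Each series is measured from its own first timestamp, so both start at day 0
# and can share one numeric x-axis.
x_a = days_since(series_a, series_a[0])
x_b = days_since(series_b, series_b[0])
print(x_a)
print(x_b)
```

This gives a plain numeric axis in days, but not the switch to days+hh:mm:ss.xxxx on zoom that is asked for here.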
Hi @mhangaard We are in the process of cleaning up old issues and seeing how to move forward with them. We hope to have an answer in a few weeks.