Plot DataFrame with TimedeltaIndex (x-axis)

Open s-celles opened this issue 8 years ago • 14 comments

Hello,

Plotly.py doesn't display x axis well when using a TimedeltaIndex

import pandas as pd
from io import StringIO  # pandas.compat.StringIO was removed in later pandas releases
import plotly
from plotly.graph_objs import Scatter, Layout

dat = """millis;ax;ay;az
544;-5.65;-0.39;7.45
550;-4.79;-0.59;7.26
556;-4.79;-1.33;6.79
562;-0.63;-1.33;9.53
579;-4.63;0.16;7.96
599;-5.45;0.35;7.34
618;-5.18;1.88;3.77
637;-6.12;-2.00;9.92
658;-3.80;0.51;9.02
677;-4.35;0.04;9.53
697;-3.88;0.71;8.79
717;-4.86;-0.43;8.83
741;-4.16;-1.06;8.79
756;-3.57;0.31;7.92
777;-2.71;2.79;8.32
796;-2.43;5.53;10.55
816;-2.75;3.30;8.67
835;-2.12;2.47;7.85
856;-2.04;2.63;7.85
875;-2.31;2.31;8.04
894;-2.00;3.37;8.12
922;0.86;7.69;9.65
942;-1.45;5.26;8.75
961;-1.96;4.35;8.04
985;-1.80;3.77;8.36
1001;-1.61;3.10;8.55"""

df = pd.read_csv(StringIO(dat), sep=';')
df['millis'] = pd.to_timedelta(df['millis'], unit='ms')
df = df.set_index('millis')

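# Adding a Timestamp converts the TimedeltaIndex to a DatetimeIndex.
# Comment this line out to reproduce the TimedeltaIndex plot.
df.index = df.index + pd.Timestamp("1970/01/01")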

print(df)
print(df.dtypes)
print(df.index)

plotly.offline.plot({
"data": [
    Scatter(x=df.index, y=df['ax'])
],
"layout": Layout(
    title="DataFrame with %s" % str(type(df.index))
)
})

When using TimedeltaIndex I get

[Screenshot 2017-07-22 at 12:09:19]

The x-axis shows odd values starting with P0D 0H0M0.

But when using a DatetimeIndex I get

[Screenshot 2017-07-22 at 12:16:44]
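The `P0D 0H0M0` labels resemble ISO 8601 durations, which is also the form that pandas' `Timedelta.isoformat()` produces. The sketch below only shows that string form; whether plotly.py serializes timedeltas through this exact path is an assumption, not something confirmed in this thread.

```python
import pandas as pd

# First sample from the data above: 544 ms.
td = pd.Timedelta(544, unit="ms")

# ISO 8601 duration string: P<days>DT<hours>H<minutes>M<seconds>S
iso = td.isoformat()
print(iso)
```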

It would be great if Plotly.py could handle TimedeltaIndex without hassle.

Kind regards

s-celles, Jul 22 '17 10:07

+1

ethanopp, Apr 25 '18 06:04

Does anyone have any suggestions on how the timedelta values should be labeled on the axes? Or some examples of how other plotting libraries handle this?

jonmmease, Sep 22 '18 16:09

It could be hh:mm:ss.000000000, where hh is hours, mm is minutes, ss is seconds, and the nine zeros are nanoseconds. The underlying representation of timedelta64 is an integer.
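A minimal sketch of that label format, working directly on the integer nanosecond count. The function name `format_nanoseconds` is hypothetical; with pandas, the integer could presumably be taken from `pd.Timedelta.value`. Putting a sign in front of the magnitude is a choice made here for readability, not an established convention.

```python
def format_nanoseconds(total_ns: int) -> str:
    """Format an integer nanosecond count as [-]hh:mm:ss.nnnnnnnnn."""
    sign = "-" if total_ns < 0 else ""
    seconds, nanos = divmod(abs(total_ns), 1_000_000_000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    # Hours are not wrapped at 24, so spans over a day read as e.g. 49:00:00.
    return f"{sign}{hours:02d}:{minutes:02d}:{seconds:02d}.{nanos:09d}"


print(format_nanoseconds(544_000_000))         # 00:00:00.544000000
print(format_nanoseconds(-3_661_000_000_001))  # -01:01:01.000000001
```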

yjyytyuiy, Oct 17 '18 12:10

Timedelta values could be labeled on the axes in the same manner as Timestamp values (which are labeled correctly), but without the date. On my side I only need millisecond resolution, but I can perfectly understand that it may be different for people who need microsecond or nanosecond resolution.

s-celles, Jun 19 '19 12:06

Maybe this Pandas issue about formatting Timedelta to string https://github.com/pandas-dev/pandas/issues/26897 should also be considered.

Below 24h, Timedelta could probably be written using n days hh:mm:ss.xxx

Also be aware that the way negative Timedelta values are currently formatted by Pandas (and Python) is not really "human" readable https://github.com/pandas-dev/pandas/issues/17232

When the event resolution is only milliseconds, we probably don't need to display Timestamp with microsecond or nanosecond resolution.

s-celles, Jun 19 '19 13:06

Any update on this? I'm using plotly express to look at histograms of TimeDelta objects and the X-axis labeling is basically useless. I have to manually convert to hours, minutes, etc.

benjaminjack, Oct 26 '19 21:10

If your organization has a software budget and needs this feature, you can prioritize & sponsor the development of it by reaching out to our team: https://plot.ly/products/consulting-and-oem/. Much of our development is funded this way.

Otherwise, we'll update this issue when it's planned for an upcoming release. There are no updates at this moment and it is still a good idea.

nicolaskruchten, Oct 29 '19 04:10

I've tried to fix this in different ways within my scratch plotly.py space but it always comes down to how Timedelta wants to be treated in plotly.js as a numeric entity and not as a weird date or string value.

What happens now, when using a Timedelta series or index, is that the underlying data representations are written out as raw numerical values and the time unit information is lost. The values just show up in plotly.js as numbers. With extra work, you can cause the timedeltas to be printed out as strings, but then you lose the niceness of a numerical graph (interpolation, scaling, etc).

What would be really nice is to let timedeltas write themselves out numerically, much as they already do, and let that numeric value get passed around as it already is, but have plotly.js provide a Timedelta-specific print format specifier. That way axis labels and hovers would print in a timedelta ISO format while still being expressed numerically under the covers.

I'm suggesting that all that's needed is a Timedelta-specific print format specifier. Maybe something that uses %t and %T, where the underlying units can be either assumed or tacked on as a modifier, "%t!s" as an idea. More format expressions would be nice if possible, and the more like date formatting the better, but these are just quick ideas for syntax examples.
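In the absence of such a specifier, one rough approximation is to keep the axis numeric and supply explicit tick labels through the existing `tickvals` and `ticktext` axis attributes. The sketch below uses a hypothetical helper `label` and hand-picked tick positions; the labels are static, so they will not adapt when zooming, and hover text is not covered.

```python
import pandas as pd

# Three samples from the data above, as a TimedeltaIndex.
index = pd.to_timedelta([544, 777, 1001], unit="ms")

# Keep the axis numeric: plot elapsed seconds as floats.
x_seconds = list(index.total_seconds())


def label(seconds: float) -> str:
    """Hypothetical helper: render seconds as hh:mm:ss.mmm."""
    millis = round(seconds * 1000)
    s, ms = divmod(millis, 1000)
    m, s = divmod(s, 60)
    h, m = divmod(m, 60)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"


# Hand-picked tick positions (in seconds) and their timedelta-style labels.
tick_seconds = [0.5, 0.75, 1.0]
tick_labels = [label(s) for s in tick_seconds]

# Plain-dict figure spec; it can be passed to plotly.offline.plot(figure).
figure = {
    "data": [{"type": "scatter", "x": x_seconds, "y": [-5.65, -2.71, -1.61]}],
    "layout": {
        "xaxis": {
            "tickmode": "array",
            "tickvals": tick_seconds,
            "ticktext": tick_labels,
        }
    },
}
```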

There's just no way to use Timedelta as an axis cleanly without something like this. It either gets turned into weird datetimes or you have to turn them all into strings.

If you only want the time and no days, you can still hack timedelta somewhat by turning them all into actual datetimes:

df.index = df.index + pd.Timestamp("1970/01/01")

and then using a tick format specifier to print only the time portion:

"layout": Layout(
    title="DataFrame with %s" % str(type(df.index)),
    xaxis_tickformat='%X.%LS',
)
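Put together, the hack might look like the sketch below. The format string `%H:%M:%S.%L` (hours, minutes, seconds, milliseconds in d3 time-format notation) is an assumption substituted here for illustration and is not the string from the snippet above; the figure is written as a plain dict so the sketch runs without plotly installed.

```python
import pandas as pd

# Three samples from the data above, as a TimedeltaIndex.
elapsed = pd.to_timedelta([544, 777, 1001], unit="ms")

# Anchor the timedeltas at an arbitrary date to get a DatetimeIndex.
as_datetimes = elapsed + pd.Timestamp("1970-01-01")

# Plain-dict figure spec; it can be passed to plotly.offline.plot(figure).
figure = {
    "data": [{"type": "scatter", "x": as_datetimes, "y": [-5.65, -2.71, -1.61]}],
    "layout": {
        # Show only the time-of-day part of each tick, hiding the fake date.
        "xaxis": {"type": "date", "tickformat": "%H:%M:%S.%L"},
    },
}
```

This only works cleanly while the timedeltas stay below 24 hours, since anything longer rolls over into the hidden date part.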

Having said all of this, and having already wished for more than just ISO format, something even more flexible, like embedded JS or some client-side callback, might find even more uses: %@func_name

But that's pie in the sky dreaming.

jwminton, Sep 04 '20 21:09

For reference, there are two other issues closely related to this one, covering the y axis (#801) and the color axis (#3368). It would probably make sense to tackle all three issues holistically, rather than separately.

harahu, Sep 02 '21 13:09

I'd like to see this, even if plotly express just handles the conversion from timedelta to datetime and then the formatting of the x-axis ticks.

S-Hanly, Dec 10 '21 21:12

Hi, any update on this issue?

zivh2pro, Jan 16 '22 06:01

I'm also looking forward to this issue being resolved. In my case, I want to plot a Gantt-chart-like bar plot where the x-axis goes from one time to another, which in a bar plot is given by a timedelta. px.timeline can handle that somehow, but there it defines the start and end points instead of start and duration. It is much more limited though.

Edit: I have compared the HTML generated by px.timeline, which handles the dates correctly. I managed to do a workaround with 2 simple modifications to my bar plot:

  • adding a type:"date" to the xaxis: fig['layout']['xaxis'].update(dict( type="date" ))
  • dividing the timedelta by 1 million (seems that the time units are the problem here): x=(df['Finish']-df['Start'])/1000000

Now I have a decent range and tick labels on the x axis
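One possible reading of why dividing by one million helps: pandas stores timedeltas as integer nanoseconds, while a plotly date axis accepts plain numbers as milliseconds. That interpretation is an assumption, not something confirmed in this thread. The sketch below makes the unit conversion explicit with floor division instead of a bare `/1000000`; the `Start` and `Finish` values are made up for illustration.

```python
import pandas as pd

# Hypothetical task table with the column names used in the comment above.
df = pd.DataFrame({
    "Start": pd.to_datetime(["2022-04-06 08:00", "2022-04-06 09:30"]),
    "Finish": pd.to_datetime(["2022-04-06 09:00", "2022-04-06 12:00"]),
})

# Durations as a timedelta64 Series.
duration = df["Finish"] - df["Start"]

# Floor-dividing by a 1 ms Timedelta yields integer milliseconds.
duration_ms = duration // pd.Timedelta(milliseconds=1)
print(list(duration_ms))  # [3600000, 9000000]

# Axis setting from the workaround, as a plain dict.
xaxis_update = {"type": "date"}
```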

filipesmg, Apr 06 '22 13:04

I also have this issue. I have some time series that are a few years long, but I would like the time axis to show time since a specific date. That would also let me plot several time series together for comparison, even though they have different start dates in calendar time. The tick labels should be days, tens of days, or hundreds of days. When I zoom in, they could switch to days+hh:mm:ss.xxxx.

mhangaard, Apr 16 '24 05:04

Hi @mhangaard, we are in the process of cleaning up old issues and seeing how to move forward with them. We hope to have an answer in a few weeks.

Coding-with-Adam, May 16 '24 15:05