plotly.py
Plotting DataFrame with timedelta64 (y-axis)
Similar to #799 but on y-axis and https://github.com/pandas-dev/pandas/issues/16953
import pandas as pd
from io import StringIO  # pandas.compat.StringIO is gone from current pandas
import plotly
import plotly.graph_objs as go

dat = """c1,c2,c3
1000,2000,1500
9000,8000,1600"""
df = pd.read_csv(StringIO(dat))
# Interpret every column as a duration in milliseconds
df = df.apply(lambda x: pd.to_timedelta(x, unit='ms'))
print(df)
print(df.dtypes)
print(df.index)

trace1 = go.Bar(
    x=df.index,
    y=df.c1,
    name='c1'
)
trace2 = go.Bar(
    x=df.index,
    y=df.c2,
    name='c2'
)
trace3 = go.Bar(
    x=df.index,
    y=df.c3,
    name='c3'
)
data = [trace1, trace2, trace3]
layout = go.Layout(
    barmode='group'
)
plotly.offline.plot({
    "data": data,
    "layout": layout
})
This displays a grouped bar chart in which the y-axis values are not displayed correctly.
A workaround is to do:
for col in df.columns:
    df[col] = df[col] + pd.to_datetime('1970/01/01')
but it would be nice if plotly.py could handle timedelta64 on the y-axis.
Is there any progress on this? I really need to use timedelta64 on the y-axis.
That's really not much of a workaround. Showing Jan 1, 1970 at the bottom... Timedelta is really a standard feature used all the time and plots with the y-axis being timedelta are very common.
For reference, there are two other issues closely related to this one, covering the x axis (#799), and color axis (#3368). It would probably make sense to tackle all three issues holistically, rather than separately.
Hey, I'm a bit late to the party, but I wrote a solution for this issue. You can have Dash, including its interactive features, working with a time format of your choice. This includes the auto-rendering that handles many x sample points, so that it won't crash your axis.
It goes as follows:
For the x-axis, make absolutely sure that your values are a list[str] and that the strings are consistently formatted. Also, match the format of those strings in the tickformat of the axis; the time format specifiers are the standard ones used with datetime objects. For reference, see: https://plotly.com/python/reference/layout/xaxis/#layout-xaxis-tickformat
e.g.:
mock_list = ["00:00:00", "00:00:01"]
Use mock_list as the x values in a scatter plot, for instance, then adjust the axis as follows:
fig.update_xaxes(tickformat="%H:%M:%S")
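For anyone starting from timedelta values rather than ready-made strings, here is a minimal sketch of how such a consistently formatted list could be built. The helper name to_hms is hypothetical (it is not part of plotly or pandas), and it assumes non-negative durations; hours are zero-padded to two digits but are allowed to exceed 24.

```python
from datetime import timedelta


def to_hms(delta: timedelta) -> str:
    """Format a non-negative timedelta as zero-padded HH:MM:SS.

    Hypothetical helper for illustration; fractional seconds are truncated.
    """
    total = int(delta.total_seconds())
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


deltas = [timedelta(seconds=0), timedelta(seconds=1), timedelta(minutes=75)]
mock_list = [to_hms(d) for d in deltas]
print(mock_list)  # ['00:00:00', '00:00:01', '01:15:00']
```

The resulting mock_list can then be passed as the x values exactly as described above.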
Thanks @ThomasGl. However, this works only for the last subplot in the figure.
The X axis is displayed in "%H:%M:%S". The 3rd (bottom) subplot hover X data is in "%H:%M:%S". But the 1st and 2nd subplots' hover X data is still in Jan 1, 1970, ...%H:%M:%S. How can I make them %H:%M:%S as well?
I tried
for xaxis in range(1, 4):
    fig['layout'][f'xaxis{xaxis}']['tickformat'] = "%H:%M:%S.%f"
but it did not help.
Hi. I'll take a look at it over the weekend, but can you share a bit more information about the issue you are having?
@dizcza also take note that you must adjust the axis for each subplot, as the engine responsible for generating the graphs renders each one as a new "fig" object with default params.
Here is the code I'm using:
import numpy as np
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# NOTE: colors, Record and DATA_DIR come from my project and are not shown here.

def add_traces(fig, record_data_dict: dict):
    # only one key/value for now in this dict
    for sensor, record_data in record_data_dict.items():
        y = np.random.randn(1000, 3)
        # convert s to ms
        time_ms = (record_data.time * 1000).astype(np.int32)
        td = pd.to_timedelta(time_ms, unit='ms') + pd.Timestamp("1970/01/01")
        idx = np.arange(len(y)).astype(str)
        for dim in range(3):
            trace = go.Scatter(
                x=td,
                y=y[:, dim],
                hovertext=idx,
                name="AAA",
                legendgroup=sensor,
                showlegend=dim == 0,
                marker=dict(color=colors[sensor]),
                line=dict(color=colors[sensor]),
                opacity=0.8
            )
            fig.add_trace(trace, row=dim + 1, col=1)

def plot_fig(record_dir=DATA_DIR / "2023.02.28"):
    fig = make_subplots(rows=3, shared_xaxes=True)
    record = Record(record_dir)
    add_traces(fig, record.data)
    fig['layout']['xaxis3']['title'] = "Time, s"
    fig.update_layout(
        title=record_dir.name,
        legend_title="Sensor",
    )
    fig.update_xaxes(tickformat="%H:%M:%S.%f")
and here is the plot.
The hover data in the 1st and 2nd plots is incorrect: it starts with Jan 1, 1970.
@dizcza also take note that you must adjust the axis for each subplot, as the engine responsible for generating the graphs renders each one as a new "fig" object with default params.
How can I do so? In my case, I have only one figure.
Each subplot renders the engine plot for a figure, in the sense that you have as many fig objects as you have subplots. Thus, in your case, you have 4 fig objects: one containing the subplots, and then 3 more, as you have 3 subplots.
As for starting in Jan 1, 1970: this is the standard initial date used when date information is missing, meaning that you don't have a "DD:MM:YYYY"-like string in the element responsible for rendering it. Check the documentation for the Dash plots in case it changed or uses a slightly different format. This could be generated using a list comprehension.
Yet, as for correcting the timestamp, move the line with fig.update_xaxes to the last line of the function add_traces.
By the way, I can't tell whether there's an error in your data without the argument passed to plot_fig. By that I mean that I need whatever DATA_DIR contains in order to recreate your plots.
All right, here is fully reproducible code:
import numpy as np
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots

def add_traces(fig):
    # only one key/value for now in this dict
    y = np.random.randn(1000, 3)
    time_s = np.random.rand(len(y)).cumsum()
    time_ms = (time_s * 1000).astype(np.int32)
    td = pd.to_timedelta(time_ms, unit='ms') + pd.Timestamp("1970/01/01")
    idx = np.arange(len(y)).astype(str)
    for dim in range(3):
        trace = go.Scatter(
            x=td,
            y=y[:, dim],
            hovertext=idx,
            name="AAA",
            showlegend=dim == 0,
            opacity=0.8
        )
        fig.add_trace(trace, row=dim + 1, col=1)

def plot_fig():
    fig = make_subplots(rows=3, shared_xaxes=True)
    add_traces(fig)
    fig['layout']['xaxis3']['title'] = "Time, s"
    fig.update_xaxes(tickformat="%H:%M:%S.%f")
    fig.show()

if __name__ == '__main__':
    plot_fig()
As for starting in Jan 1, 1970: this is the standard initial date used when date information is missing, meaning that you don't have a "DD:MM:YYYY"-like string in the element responsible for rendering it. Check the documentation for the Dash plots in case it changed or uses a slightly different format. This could be generated using a list comprehension.
I understand that this is the standard initial date. But showing the date while hovering is not expected. I expect both the X axis and the hover X values to be formatted as %H:%M:%S.
Yet, as for correcting the timestamp, move the line with fig.update_xaxes to the last line of the function add_traces.
Tried with no luck.
Just try running this example and hover on the 1st, 2nd, and 3rd subplots, and you'll see the difference.
I see, you only want the x-axes to show up in "%H:%M:%S"?
I'll run it tomorrow night.
I see, you only want the x-axes to show up in "%H:%M:%S"?
Correct. Not only the X axis (the bottom panel) but also X values when I hover the mouse over any subplot.
import numpy as np
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots

def add_traces(fig):
    # only one key/value for now in this dict
    y = np.random.randn(1000, 3)
    time_s = np.random.rand(len(y)).cumsum()
    time_ms = (time_s * 1000).astype(np.int32)
    td = [time[len("0 days "):] for time in pd.to_timedelta(time_ms, unit='ms').astype(str)]
    idx = np.arange(len(y)).astype(str)
    for dim in range(3):
        trace = go.Scatter(
            x=td,
            y=y[:, dim],
            hovertext=idx,
            name="AAA",
            showlegend=dim == 0,
            opacity=0.8
        )
        fig.add_trace(trace, row=dim + 1, col=1)

def plot_fig():
    fig = make_subplots(rows=3, shared_xaxes=True)
    add_traces(fig)
    fig['layout']['xaxis3']['title'] = "Time, s"
    fig.update_xaxes(tickformat="%H:%M:%S.%f")
    fig.show()

if __name__ == '__main__':
    plot_fig()
@dizcza I am pretty sure this is what you were looking for. I didn't quite understand why you were adding pd.Timestamp("1970/01/01"). Also, be aware of the type Dash expects for this operation to work: it needs a List[str] object where the strings are already formatted, e.g. for "03:45:10" the expected format is "%H:%M:%S".
@ThomasGl thanks, this is promising, but the X axis labeling looks weird and not so intuitive in my original example. I mean it's much easier to look at [image] than [image].
I didn't quite understand why you were adding pd.Timestamp("1970/01/01")
Because if I don't, I'm getting this:
Just like the author of this issue reported. He added pd.Timestamp("1970/01/01") to work around this, and so do I.
Thanks for the effort though. I'm not sure which version I'll use: the one with "1970/01/01" up front, which confuses users, or the one with confusing X axis string labels for each point.
Hmmm. I mean, yes, it does look a bit overcrowded, but that's because of the density of your data. If you zoom in, you will see that it fits better. Again, I suggest you look in the plotly documentation at the behavior of update_xaxes(); it might have some options for adjusting the precision with which you see the x labels. I am not sure how, as I didn't have to do it in my own projects.
Yet I hope I helped you understand a bit more and that you can carry on from here.
The core challenge here is that Plotly's date/time axes can today only represent specific absolute instants in time (e.g. March 5, 2023 at 8:13am UTC), and hence are incompatible with relative timedelta representations. Adding an absolute instant to such objects converts them to absolute instants, and by forcing the axis/hover displays to include only day-of-month/hour-of-day/minute etc. information, you can hide the underlying absoluteness of the data point to an extent, but this has limits. For example, if you add January 1, 1970 and your delta represents 31 days, then the "days" portion will be incorrectly displayed as 1 (i.e. February 1). More generally, you will not be able to display times in formats like "200 minutes" or "26 hours and 4 minutes".
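The wrap-around described above can be reproduced without Plotly at all. The following is a small sketch using only Python's standard library; shifted_label is a hypothetical helper, and Python's strftime stands in for the axis formatter, on the assumption that the %H, %M, %S and %d specifiers behave the same way there.

```python
from datetime import datetime, timedelta

# Epoch anchor, matching the workaround used earlier in this thread.
EPOCH = datetime(1970, 1, 1)


def shifted_label(delta: timedelta, fmt: str) -> str:
    """Format a timedelta the way a date axis would after adding EPOCH.

    Hypothetical helper for illustration only.
    """
    return (EPOCH + delta).strftime(fmt)


# Under 24 hours the label is indistinguishable from a real duration.
print(shifted_label(timedelta(minutes=3, seconds=45), "%H:%M:%S"))  # 00:03:45

# 26 h 4 min wraps around midnight: the hour field shows 02, not 26.
print(shifted_label(timedelta(hours=26, minutes=4), "%H:%M"))  # 02:04

# 31 days lands on February 1, so the day field shows 01, not 31.
print(shifted_label(timedelta(days=31), "%d"))  # 01
```

In other words, the epoch trick is only safe while every duration stays below 24 hours and no day, month or year field is shown.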
We are aware of these limitations in the library and would certainly undertake the development required to add relative time axes to the underlying Plotly.js library, but this would require external sponsorship.
If you are mostly concerned with the hoverlabel, you can use the following single line to set the hovertemplate for all your traces to only include the h/m/s portion of the X value: fig.update_traces(hovertemplate="%{x|%H:%M:%S.%f}, %{y}")
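For durations that can exceed 24 hours, one possible alternative (not proposed in this thread, so treat it as an assumption-laden sketch) is to keep the axis numeric, plot elapsed seconds, and supply the labels yourself through tickvals/ticktext and customdata. The helpers format_elapsed, make_ticks and build_figure are hypothetical names, and non-negative durations are assumed.

```python
def format_elapsed(seconds: float) -> str:
    """Render a non-negative duration as H:MM:SS, letting hours grow past 24."""
    total = int(round(seconds))
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def make_ticks(max_seconds: float, step: int):
    """Return (tickvals, ticktext) at every `step` seconds up to max_seconds."""
    tickvals = list(range(0, int(max_seconds) + 1, step))
    ticktext = [format_elapsed(v) for v in tickvals]
    return tickvals, ticktext


def build_figure(elapsed_seconds, values, step=3600):
    """Plot values against elapsed seconds with duration-style labels."""
    import plotly.graph_objects as go  # imported lazily; needs plotly installed

    tickvals, ticktext = make_ticks(max(elapsed_seconds), step)
    fig = go.Figure(
        go.Scatter(
            x=list(elapsed_seconds),
            y=list(values),
            # customdata carries the preformatted label shown in the hover box
            customdata=[format_elapsed(s) for s in elapsed_seconds],
            hovertemplate="%{customdata}, %{y}<extra></extra>",
        )
    )
    # Fixed tick positions on a plain numeric axis, labelled as durations
    fig.update_xaxes(tickmode="array", tickvals=tickvals, ticktext=ticktext)
    return fig


print(format_elapsed(26 * 3600 + 4 * 60))  # 26:04:00
```

The trade-off is that the ticks are fixed: they do not get re-chosen when you zoom, so the step has to suit the range you expect to look at.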
Thanks @nicolaskruchten. I had trouble with the hovertemplate language in the past, which is why I had been avoiding templates until you showed me how to use them. Your solution works like a charm.
With these two hacks in mind, adding pd.Timestamp("1970/01/01") and hovertemplate="%{x|%H:%M:%S.%f}, %{y}", I was able to achieve what I want. At least from the user's perspective, all looks nice and shiny.
@juandering sure, here it is
td = pd.to_timedelta(time_ms, unit='ms') + pd.Timestamp("1970/01/01")
trace = go.Scatter(x=td, y=...)
fig = make_subplots(rows=3, shared_xaxes=True, vertical_spacing=0.03)
fig.add_trace(trace, row=1, col=1)
fig.update_xaxes(tickformat="%H:%M:%S.%f")
hovertemplate = "%{x|%H:%M:%S.%f}, %{y}<br>point=%{hovertext}"
fig.update_traces(hovertemplate=hovertemplate)
Many thanks @dizcza.