plotly.py
Drawing lines / `add_shape()` is very slow, possible quadratic Schlemiel the Painter algorithm
To reproduce: Create `lines.py` as follows:
import plotly.graph_objects as go
import plotly.express as px
import time
import random

N = [50, 100, 200, 400, 800]


def plot_random_lines(n):
    fig = go.Figure()
    for i in range(n):
        c = [random.random() for _ in range(4)]
        fig.add_shape(type='line', x0=c[0], y0=c[1], x1=c[2], y1=c[3])
    # We don't show the figure to avoid any possible influence from the
    # graphics driver.


def timings():
    t_cum = []
    for n in N:
        t0 = time.process_time_ns()
        plot_random_lines(n)
        t_cum.append((time.process_time_ns() - t0) / 1e6)
    t_per_line = [t/n for (t, n) in zip(t_cum, N)]
    fig1 = px.scatter(x=N, y=t_cum, labels={'x': 'Number of lines', 'y': 'Cumulative time [ms]'})
    fig1.show()
    fig2 = px.scatter(x=N, y=t_per_line, labels={'x': 'Number of lines', 'y': 'Time per line [ms]'})
    fig2.show()


timings()
Install plotly and run the above example.
- Expected: Draws the lines in a few milliseconds
- Actual: It takes more than half a minute on a modern MacBook
Notice that the time per line increases linearly with the number of lines drawn, so the cumulative time grows quadratically.
This looks like a classic example of a Schlemiel the Painter algorithm, a candidate for Joel Spolsky's collection.
Observations
I suspect that the following code locations are related to the bug.
- In https://github.com/plotly/plotly.py/blob/master/packages/python/plotly/plotly/basedatatypes.py#L5310, `curr_val` increases in length with each call to `add_shape()`.
- In https://github.com/plotly/plotly.py/blob/master/packages/python/plotly/_plotly_utils/basevalidators.py#L2553, `v` increases in length with each call.
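To illustrate why a growing `curr_val` / `v` would produce this behaviour, here is a toy model in plain Python. It is not plotly's actual code: the function names and the counter are hypothetical, and it only assumes that each append rebuilds and re-visits the whole existing tuple of shapes.

```python
def append_with_full_rebuild(shapes, new_shape, counter):
    """Toy model (not plotly code): rebuild the whole tuple on every append."""
    combined = shapes + (new_shape,)   # copies all existing elements
    counter[0] += len(combined)        # pretend each element is re-validated once
    return combined


def total_element_visits(n):
    """Count how many element visits n consecutive appends cause in the toy model."""
    shapes = ()
    counter = [0]
    for i in range(n):
        shapes = append_with_full_rebuild(shapes, {"type": "line", "id": i}, counter)
    return counter[0]


if __name__ == "__main__":
    for n in (50, 100, 200, 400, 800):
        visits = total_element_visits(n)
        # visits is 1 + 2 + ... + n = n * (n + 1) / 2, so visits / n grows linearly
        print(n, visits, visits / n)
```

In this model the work per added line grows linearly with the number of lines already present, which matches the shape of the measured "time per line" plot. Whether the two linked code locations really behave this way would need to be confirmed by profiling.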
#2840 allows you to add lines using a different call that is much faster. It was mentioned here in the forum.
I ran into the same issue trying to add hundreds of finite rectangular annotations to a scatter plot. This is not possible with #2840 because the PR only addresses lines and infinite rectangles.
I found the `update_layout` method on Stack Overflow here. It significantly sped up my plotting.
Here's some example code (untested in current form):
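import pandas as pd
import plotly.graph_objects as go

# Hypothetical example data: one row per rectangle, with corner coordinates.
rectangle_dataframe = pd.DataFrame(
    {
        "x0": [0.0, 1.0, 2.0],
        "y0": [0.0, 0.5, 1.0],
        "x1": [0.5, 1.5, 2.5],
        "y1": [0.5, 1.0, 1.5],
    }
)

fig = go.Figure()

# Build all shape definitions first instead of calling add_shape() per row.
rectangles = [
    dict(
        type="rect",
        x0=row["x0"],
        y0=row["y0"],
        x1=row["x1"],
        y1=row["y1"],
        line=dict(color="RoyalBlue", width=2),
        fillcolor="LightSkyBlue",
        opacity=0.2,
    )
    for idx, row in rectangle_dataframe.iterrows()
]
shapes = [go.layout.Shape(r) for r in rectangles]

# Assign every shape in a single layout update.
fig.update_layout(shapes=shapes)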