
Paths is surprisingly slow

Open mwaskom opened this issue 2 years ago • 0 comments

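import io

import matplotlib as mpl
import matplotlib.collections  # makes mpl.collections available
import matplotlib.figure  # makes mpl.figure available
import numpy as np
import pandas as pd
import seaborn.objects as so

# Build n line segments, each a (10, 2) array of x, y points
n = 5000
segments = []
for i in range(n):
    segments.append(
        np.column_stack([
            np.arange(10),
            np.linspace(0, i, 10)
        ])
    )

# Long-form equivalent: one row per point, with i identifying the segment
df = pd.concat([
    pd.DataFrame(xys, columns=["x", "y"]).assign(i=i)
    for i, xys in enumerate(segments)
], ignore_index=True)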

Pure matplotlib version:

%%timeit -n 3
f = mpl.figure.Figure()
ax = f.subplots()
ax.add_collection(mpl.collections.LineCollection(segments))
ax.autoscale_view()
f.savefig(io.BytesIO(), dpi=100)
300 ms ± 8.12 ms per loop (mean ± std. dev. of 7 runs, 3 loops each)

Seaborn version:

%%timeit -n 3
so.Plot(df, "x", "y", group="i").add(so.Paths()).save(io.BytesIO(), dpi=100)
1.22 s ± 240 ms per loop (mean ± std. dev. of 7 runs, 3 loops each)

Mapping color adds a further cost in seaborn that is not present in the matplotlib version:

%%timeit -n 3
f = mpl.figure.Figure()
ax = f.subplots()
ax.add_collection(mpl.collections.LineCollection(segments, array=np.arange(n)))
ax.autoscale_view()
f.savefig(io.BytesIO(), dpi=100)
307 ms ± 5.95 ms per loop (mean ± std. dev. of 7 runs, 3 loops each)

vs.

%%timeit -n 3
so.Plot(df, "x", "y", color="i").add(so.Paths()).save(io.BytesIO(), dpi=100)
3.24 s ± 264 ms per loop (mean ± std. dev. of 7 runs, 3 loops each)
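
The timings above show that the seaborn calls are slower but not where the time goes. One way to narrow that down is to run the call under the standard-library profiler. This is only a sketch: `profile_call` and `busy_work` are hypothetical names introduced here, and `busy_work` is a stand-in workload so the snippet runs without seaborn.

```python
import cProfile
import io
import pstats


def profile_call(func, top=10):
    """Run func() under cProfile and return the `top` entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        profiler.disable()
    # Collect the report in a string instead of printing straight to stdout
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


def busy_work():
    # Stand-in workload (an assumption, not part of the benchmark above)
    return sum(i * i for i in range(100_000))


report = profile_call(busy_work)
print(report)
```

To apply it to the slow case, pass the seaborn call as a zero-argument callable, e.g. `profile_call(lambda: so.Plot(df, "x", "y", color="i").add(so.Paths()).save(io.BytesIO(), dpi=100))`, and look for the seaborn frames with the largest cumulative time.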

mwaskom • Jun 27 '22 23:06