
wrccdc "delete -where" lake operation sees 43% perf hit at super#5502 changes

Open philrz opened this issue 1 year ago • 0 comments

Among the SuperDB data lake operations run in Autoperf, the step change shown here in the wrccdc "delete -where" queries coincides with the arrival of the changes in https://github.com/brimdata/super/pull/5502.

Image

Details

The run time of this lake maintenance operation went from 1.29 minutes to 1.86 minutes, which represents a 43% performance hit. As usual when a shift like this shows up in the macro trend, it's likely an entirely appropriate cost for the other benefits of the change. However, since the PR notes did not call out an anticipated perf hit, I've opened this issue to document the transition in case anyone wants to look closer.
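For anyone wanting to reproduce the percentage, here is a minimal sketch of the arithmetic. The function name `percent_slowdown` is hypothetical and not part of Autoperf. It uses only the two rounded run times quoted above, so it lands at roughly 44%; the 43% figure presumably came from the unrounded measurements, which is an assumption on my part.

```python
def percent_slowdown(before_minutes: float, after_minutes: float) -> float:
    """Return the relative increase in run time as a percentage.

    Hypothetical helper for illustration only; not taken from Autoperf.
    """
    if before_minutes <= 0:
        raise ValueError("before_minutes must be positive")
    return (after_minutes - before_minutes) / before_minutes * 100.0


if __name__ == "__main__":
    # Rounded run times from this issue, in minutes.
    before, after = 1.29, 1.86
    print(f"{percent_slowdown(before, after):.1f}%")  # prints "44.2%"
```

The small gap between this result and the reported 43% is within what rounding the inputs to two decimal places can produce.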

The exact operation against the full wrccdc data set looks like:

super db delete -use wrccdc -where '_path=="loaded_scripts"'

At the moment Autoperf is still running against the SuperDB data lake, so the lake is still using the sequential runtime against BSUP data.

philrz, Jan 15 '25 23:01