online icon indicating copy to clipboard operation
online copied to clipboard

Generic Calc performance ticket

Open mmeeks opened this issue 1 year ago • 77 comments

Editing just now on staging with a larger group we got a number of problems with calc performance that should be improved.

perf-23 05-calc

Oddly we were getting large memory consumption for a smallish spreadsheet too - perhaps related to clipboads:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 32675 cool 20 0 20.003t 2.904g 343088 R 103.0 9.263 6:31.37 kitbroker_012

which is really unexpected.

mmeeks avatar Jul 13 '23 10:07 mmeeks

Profile seems to suggest that we are serializing some vast area - quite probably a whole spreadsheet selection - and that this is creating lots of columns and so on - which is - far from great.

mmeeks avatar Jul 13 '23 10:07 mmeeks

Lets get the document from Pedro & can you look at this @eszkadev ?

mmeeks avatar Jul 13 '23 13:07 mmeeks

I cannot reproduce case where getTextSelection takes significant piece of a profile. Tested with large selections and others modifying the content.

I think in this ticket we see long-term profile where most of the operations were selection changes, people were clicking here and there without typing or any other actions - so that is why we see it bigger.

In my profiles where editing happens bigger problem is rendering of the tiles or full column selection.

eszkadev avatar Jul 14 '23 06:07 eszkadev

Hmm - I strongly suspect the memory blow-out and swapping which matches that profile and the very significant memory use can only come from that. I imagine that it is specific to Pedro's document which oddly has all 16k columns allocated for me too (causing all manner of problems =)

mmeeks avatar Jul 14 '23 07:07 mmeeks

wrt:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 32675 cool 20 0 20.003t 2.904g 343088 R 103.0 9.263 6:31.37 kitbroker_012

the startling virt num indicates an asan build, for a fresh blank spreadsheet under asan I get

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 11532 cool 20 0 20.003t 2.763g 318000 S 0.000 8.816 0:05.36 kitbroker_003

before we do anything.

caolanm avatar Aug 09 '23 13:08 caolanm

I think this was it :

pedropintosilva avatar Aug 11 '23 08:08 pedropintosilva

This weeks calc test looks like: perf-calc

caolanm avatar Aug 17 '23 11:08 caolanm

I suspect we should break out the performance problems here that can't be seen in the profile into their own tickets:

  • full-sheet invalidations left & right - it looked like that if someone changes eg. a row-height on another sheet, we get an invalidaton of any other sheets - possibly just a missing 'part' in the invalidation.
  • editing a long string into a non-wrapped cell that will later be cropped seems to generate a large invalidation to the bottom right of the sheet.
  • I saw quite a surprising number of tiles with no deltas on them in my view - I'd expect in a well behaved session to see deltas on nearly every tile, and/or yellow "no change deltas" queued up in the UI. Would be good to find what gets rid of those.

And bits we can see in the profile:

  • Document::drainQueue -> MessageQueue::pop which I imagine does a lot of TileQueue::removeCallbackDuplicate consumes a good chunk of time parsing, and (I guess) re-parsing things - as soon as the queue gets long enough: which requires a slow server and lots of people.
  • lok::Document::setView - ... - ScInputBarGroup::NumberOfLinesChanged ... ImplImage::loadStockAtScale seems particularly unhelpful to do each time we switch view and (presuambly) easy enough to avoid continually re-loading - but not much cost there.
  • Encouragingly ScgridWindow::DrawContent is a chunk of the time - prolly better to reduce that by reducing invalidations rather than micro-optimizing that; although the amount of time we use setting up and clearing new EditEngines which perhaps we could cache for the screen content is quite surprising.
  • DeltaGenerator::DeltaBitmapRow::initRow - seems to be the lions share of delta creation though not much of the time spent, possibly some 'likely' annotation on branches would help; quite possibly we should not bother allocating and cpoying new row data if the previous row was the same - although prolly asan distorts the cost of malloc/free.

I guess the most interesting bits are the user interaction / invalidation problems - we should get a machine setup with no asan for profiling I think.

mmeeks avatar Aug 17 '23 16:08 mmeeks

When I do:

  • full-sheet invalidations ... a row-height on another sheet, we get an invalidation of any other sheets I see the blinkenlights flash in the other sheet, but the 'invalidates' counts in the tile overlay remains the same. I wonder if that is more an artifact of the overlay flashing for changes on other sheets that don't affect the one visible under it?

But I also see:

  • With one window on a tab X with another window on tab Y. Editing in the first view on tab X and then moving cursor to another cells "kills the editview" and the other views are invalidated for the same area but on whatever tab they happen to be looking at, so tab Y gets invalidated in the same area. https://gerrit.libreoffice.org/c/core/+/155816 for a little chain of commits on that thought. Blinkenlights still flash for the invalidated area on the other sheet but tile-invalidation count doesn't change with those in place.

caolanm avatar Aug 17 '23 20:08 caolanm

This weeks flamegraph for calc perf-staging-calc-2023-08-24

caolanm avatar Aug 24 '23 10:08 caolanm

We should really look at that view switching cost: doc_setView should be the moral equivalent of a single pointer change it's strobed at very high frequency =) Instead in this profile it unzips, and de-compresses a (small) PNG - worth checking if it does it every time - but we should be able to save some 30%+ of our paintPartTile cost there I think :-)

It is also interesting how little the thread pool does to help with the compression and delta generation; I'd like to hope this is because calc is good at only invalidating one tile as we type into a cell - so that is done in the main-thread but ... quite possibly there's a snafu there too.

mmeeks avatar Aug 24 '23 16:08 mmeeks

This weeks flamegraph for calc

perf-8556

caolanm avatar Sep 28 '23 11:09 caolanm

This weeks calc's perf flamegraph

perf-22435

https://github.com/CollaboraOnline/online/issues/7337 seems to have worked and at least that has shrunk

caolanm avatar Oct 05 '23 10:10 caolanm

Clearly the Calc navigator is out of control in multiview documents wrt doc_setView triggering such much updates. I'll look into that.

caolanm avatar Oct 05 '23 16:10 caolanm

Two framegraphs this week: The first for the mostly inactive session until the crash which is visible in profiling perf-19099

The second one on the restart to end: perf-21240

caolanm avatar Oct 12 '23 11:10 caolanm

Interesting: the mouse event processing seems to have an un-wanted inferior main-loop from chart::ChartControlller::executeDlg_ObjectProperties_withoutUndoGuard - which needs to be async'd:

image

And of course the amount of work done in doc_setView - to update the UI / notebookbar is visible everywhere; I'll file a separate ticket for that.

mmeeks avatar Oct 12 '23 16:10 mmeeks

@grandinj This ticket is probably interesting for you with its weekly feed of profiling porn :-)

mmeeks avatar Oct 12 '23 17:10 mmeeks

related to @caolanm flamegraphs at comment https://github.com/CollaboraOnline/online/issues/6893#issuecomment-1759411348_ I have created https://github.com/CollaboraOnline/online/pull/7430 to fix a small perf problem.

grandinj avatar Oct 12 '23 19:10 grandinj

This weeks calc flamegraph. We're skewed by the threaded formula thing but expanding kitbroker_006 to look at that alone at least we don't have the ridiculous constant creation of a WeldedNotebookbar of last week noted as https://github.com/CollaboraOnline/online/issues/7428 though still too much going on in doc_setView

perf-14321

caolanm avatar Oct 19 '23 11:10 caolanm

pre-crash calc profile:

perf-19984

looks like:

a) MergeCells is not an async dialog b) crash is visible as coming from ScDocShell::MoveTable

post-crash calc profile:

perf-21534

The problem of https://github.com/CollaboraOnline/online/issues/7428 is not totally gone, but the cost of it is less now without the notebookbar torn down and recreated for dark mode switching so there seems likely a relation to that in the earlier profiling

caolanm avatar Oct 26 '23 11:10 caolanm

perf-15373

Got an apparent modal SvxCharacterMap dialog, some hefty CalendarWrapper work and spell check work.

caolanm avatar Nov 02 '23 12:11 caolanm

(1) I note that hunspell is spending a lot of time just calling clock(), which seems odd. Perhaps it should be using clock_gettime(CLOCK_MONOTONIC) instead, to use the fancy mapping trick that Linux does and avoid a syscall.

(2) we seem to be spending a lot of time just figuring out which dictionary to use, not even doing actual dictionary loading or actual spell-checking. Surely that should be cached somewhere?

grandinj avatar Nov 02 '23 12:11 grandinj

possible fix for the hunspell issue is here: https://gerrit.libreoffice.org/c/core/+/158809

grandinj avatar Nov 02 '23 12:11 grandinj

https://gerrit.libreoffice.org/c/core/+/158830 for the i18npool::Calendar_jewish::mapFromGregorian issue, its nearly always 1999-12-31 so we can just add a fast path for that

caolanm avatar Nov 02 '23 13:11 caolanm

This weeks calc profile: perf-6686

caolanm avatar Nov 16 '23 12:11 caolanm

For reference in today's bespoke tests: First session: perf-27903

@hcvcastro The ScDBData::ExtendBackColorArea is a pretty recent new method I believe, since https://cgit.freedesktop.org/libreoffice/core/commit/?id=768433f07873eb608837630f85e7e1b375239fca Seems to take up quite a bit of time.

Second session: perf-29178

caolanm avatar Nov 17 '23 20:11 caolanm

perf-18345

This week

caolanm avatar Nov 23 '23 12:11 caolanm

IRC log in case someone wants to look at the ScPositionHelper thing, I dont know what is going on there:

void ScGridWindow::PaintTile( VirtualDevice& rDevice, - doing a big/slow __memset is unexpected - particularly that the size is larger than the pixel-size of the tile - which is quite large ...

  • mmeeks wonders if some big hash-table or STL type is fiddled with there. @noel_grandin: seems there is a ton of work done on HandleMouseEvent =) 12.5% of total time handling mouse events seems extraordinary =) a greath swath of GetRowHeight going on there - from the drawing-layer =) <noel_grandin> weird, if we are hitting GetRowHeight that means that the ScPositionHelper cache is not doing its job

grandinj avatar Nov 23 '23 14:11 grandinj

perf-15753

This weeks. No crashes, so 1h. ScTabViewShell::ExecuteTable has some non-async dialogs still

caolanm avatar Nov 30 '23 12:11 caolanm

The calloc in the graph is now at 5% and is unnecessary - no need to memset() there, the data will be completely override during the delta generation process.

grandinj avatar Nov 30 '23 12:11 grandinj