
UnitPerf CSV generation

Open mmeeks opened this issue 1 year ago • 2 comments

We have a rudimentary performance unit test in tests/UnitPerf.cpp

It generates some numbers - a histogram and so on - which are pretty;

But we want to start capturing and logging that data in a form we can easily re-use, consume and chart - and/or at least monitor.

If you make 'testPerf' fail - by crashing it with 'assert(false);' or some such - then the Makefile will kindly remind you how to run just this one test - which is what you want.

Then - it would be ideal to have a set of CSV files - and probably we should do this in the normal unit tests anyway for good measure; in each case we should use the git hash as the primary key / first item.

We should probably split CPU vs. Latency vs. Network - and generate three separate CSVs.

CPU should have the run-time in it and, as we go forward, more and more accurate CPU metrics - ideally from the libpfm API rather than the SysStopwatch class. But for now, just getting something we can graph is key.

For Latency - we have a histogram we should horizontal-ize into CSV.

And for Network - we should dump incoming & outgoing bandwidth, and then have some defined column headers for each type of thing, and dump the breakdown there so we can see it over time. I expect bandwidth to be the most reliable indicator here - and the others to jitter unhelpfully between runs =)

@Minion3665 can help with code pointers I expect.

Thanks !

mmeeks avatar Jun 13 '24 13:06 mmeeks

Hi @mmeeks ,

Please assign this to me if no one else is assigned yet. I would like to contribute

amkarn258 avatar Jun 18 '24 05:06 amkarn258

Hi Mayank - thanks so much for getting involved! I filed this for an intern - Elliot over the summer - but as long as you commit your code to a branch regularly, no doubt you could work together with him to improve this :-) I don't expect Elliot to start looking at this for another week or so - so - go for it ! =)

mmeeks avatar Jun 18 '24 08:06 mmeeks

Elliot's work merged here: https://github.com/CollaboraOnline/online/pull/9373

mmeeks avatar Jul 01 '24 15:07 mmeeks

@elliotfreebairn1 so - some other thoughts for expansion:

  1. how jittery are the numbers - for the same commit ? can you do some stats on that & build a nice spreadsheet & attach here ?
  2. can we record other interactive traces - and/or refresh the traces we have to make them re-playable, I suspect our existing traces are rather out of date
  3. can we connect more traces into the performance testing framework; ideally we'd generate different metrics for each of them https://perf.libreoffice.org/ has some examples of how that might look - and we could re-use / build on that framework.

Thanks ! =)

mmeeks avatar Jul 01 '24 15:07 mmeeks

@mmeeks For the 1st point, do you mean running the same unit tests on this device a fair few times and analysing the variation in the data?

elliotfreebairn1 avatar Jul 01 '24 15:07 elliotfreebairn1

> @mmeeks For the 1st point, do you mean running the same unit tests on this device a fair few times and analysing the variation in the data?

@elliotfreebairn1 correct, I think so - I recall you showing me that the results were pretty stable, but it would be nice to get a spreadsheet here too :)

Minion3665 avatar Jul 01 '24 15:07 Minion3665

@Minion3665 @mmeeks Here are some graphs I created via matplotlib to show the stability/variation in the data:

[Four screenshots from 2024-07-02: matplotlib graphs of run-to-run stability/variation]

Also here is the repository where the script is located: https://github.com/elliotfreebairn1/CSVPerf

elliotfreebairn1 avatar Jul 02 '24 14:07 elliotfreebairn1

I would really prefer us to use our own tool & charting engine, share the spreadsheet & have something that can be interacted with =) Can you provide the results as a spreadsheet, ideally ODS? Do we have a CPU usage graph?

I'm also interested in whether we measure peak memory usage; do we have a metric for that (?) If not, we need to think about adding one; doing that at the malloc/free level, while possible, will cause performance angst - so we should probably get the kernel's take on memory usage from /proc - at some well defined points: startup, and then per-document (which should subtract that). Measuring memory is tricky - PSS is generally a good metric to parse out - particularly for one process - can you add something there?

Thanks!

mmeeks avatar Jul 02 '24 15:07 mmeeks

@mmeeks Yeah, I can definitely get that into a spreadsheet. I've just realised I've missed out the CPU usage measurements inside UnitPerf.cpp, so I will get that graphed as well.

I've had a look around, and there doesn't seem to be a peak memory usage metric, so I will try to implement that as soon as possible and hopefully get it into that spreadsheet. Thanks for giving some pointers :)

elliotfreebairn1 avatar Jul 02 '24 16:07 elliotfreebairn1

Link to spreadsheet: PeformanceCharted.ods

elliotfreebairn1 avatar Jul 05 '24 13:07 elliotfreebairn1