crengine icon indicating copy to clipboard operation
crengine copied to clipboard

Slow documents for profiling

Open Frenzie opened this issue 5 years ago • 10 comments

By coincidence I came across this book, which seems to take a small eternity to change pages. (Okay, just a second, but that's a really long time.)

https://www.smashwords.com/books/view/631261

Anyway, so I figured I'd profile it. I'm not sure if that's helpful in this case but there you have it. ;-)

callgrind.out.12948.tar.gz

Same file as posted over at https://github.com/koreader/crengine/commit/8f15e01c8aea15920381f14ea85b0b4abeb36cfd#r32965759

Screenshot_2019-03-29_16-15-42

Btw, general remark, you can get the machine code with --dump-instr=yes, as in valgrind --tool=callgrind --dump-instr=yes ./reader.lua.

Frenzie avatar Mar 29 '19 15:03 Frenzie

Not noticing much slowness on the emulator :)

But this book is a bit different from most in that there are thousands of <p> in a <body>, each line is in its own <p>, and you have many such <p> on a page. So, many many rects, compared to book with larger paragraphs where only the paragraph is a rect. So, as crengine check each container top y and bottom y if it overlaps the page, and the go look at this contained children to find those that overlap, once it has found the BODY, it needs to check each and all <p> - and the same when on a page, and you're tapping, it needs to check if you tapped on a link, and I think it has to go thru all the <p> to find those on the page, and if they happen to be where you tapped. And even more of the same work if you have some highlights, to get the Rects to highlights.

969 323 / 21 could indicate there are 46142 <p> :) DrawDocument is called and itself checks if the node is in or out of page. 21 taps(?), 21 body to check, 46142 P in the current BODY. I guess it all happens there: https://github.com/koreader/crengine/blob/621dee08d48e5728f0ee0db4be200d71b55e7e8a/crengine/src/lvrend.cpp#L3916-L3948

I guess just moving all the lines from int em to int padding_top AFTER the if ( (doc_y + height <= 0 ... return; could help a lot, as we don't need all that that early if we return.

Can you check and time it ?:)

poire-z avatar Mar 29 '19 15:03 poire-z

Except int height, that is needed. So:

--- a/crengine/src/lvrend.cpp
+++ b/crengine/src/lvrend.cpp
@@ -3932,20 +3932,20 @@ void DrawDocument( LVDrawBuf & drawbuf, ldomNode * enode, int x0, int y0, int dx
         // A few things differ when done for TR, THEAD, TBODY and TFOOT
         bool isTableRowLike = rm == erm_table_row || rm == erm_table_row_group ||
                               rm == erm_table_header_group || rm == erm_table_footer_group;
-        int em = enode->getFont()->getSize();
-        int width = fmt.getWidth();
         int height = fmt.getHeight();
-        bool draw_padding_bg = true; //( enode->getRendMethod()==erm_final );
-        int padding_left = !draw_padding_bg ? 0 : lengthToPx( enode->getStyle()->padding[0], width, em ) + DEBUG_TREE_DRAW+measureBorder(enode,3);
-        int padding_right = !draw_padding_bg ? 0 : lengthToPx( enode->getStyle()->padding[1], width, em ) + DEBUG_TREE_DRAW+measureBorder(enode,1);
-        int padding_top = !draw_padding_bg ? 0 : lengthToPx( enode->getStyle()->padding[2], width, em ) + DEBUG_TREE_DRAW+measureBorder(enode,0);
-        //int padding_bottom = !draw_padding_bg ? 0 : lengthToPx( enode->getStyle()->padding[3], width, em ) + DEBUG_TREE_DRAW;
         if ( (doc_y + height <= 0 || doc_y > 0 + dy) && !isTableRowLike ) {
             // TR may have cells with rowspan>1, and even though this TR
             // is out of range, it must draw a rowspan>1 cell, so it it
             // not empty when a next TR (not out of range) is drawn.
             return; // out of range
         }
+        int em = enode->getFont()->getSize();
+        int width = fmt.getWidth();
+        bool draw_padding_bg = true; //( enode->getRendMethod()==erm_final );
+        int padding_left = !draw_padding_bg ? 0 : lengthToPx( enode->getStyle()->padding[0], width, em ) + DEBUG_TREE_DRAW+measureBorder(enode,3);
+        int padding_right = !draw_padding_bg ? 0 : lengthToPx( enode->getStyle()->padding[1], width, em ) + DEBUG_TREE_DRAW+measureBorder(enode,1);
+        int padding_top = !draw_padding_bg ? 0 : lengthToPx( enode->getStyle()->padding[2], width, em ) + DEBUG_TREE_DRAW+measureBorder(enode,0);
+        //int padding_bottom = !draw_padding_bg ? 0 : lengthToPx( enode->getStyle()->padding[3], width, em ) + DEBUG_TREE_DRAW;
         css_length_t bg = enode->getStyle()->background_color;
         lUInt32 oldColor = 0;
         // Don't draw background color for TR and THEAD/TFOOT/TBODY as it could

poire-z avatar Mar 29 '19 15:03 poire-z

The path on the left of your image is there: https://github.com/koreader/crengine/blob/621dee08d48e5728f0ee0db4be200d71b55e7e8a/crengine/src/lvtinydom.cpp#L12071-L12102

Same issue - top_margin & bottom_margin computed early - but less obvious to fix as these are used to decide about pt.y and return early. May be some assumption than a margin can't be more than a page height (but it can) to check for that early and return and discard most - and next do a more precise check with the top and bottom margin computer...

poire-z avatar Mar 29 '19 16:03 poire-z

@poire-z Oh drat, so I didn't profile correctly!

I noticed the slowness while tapping, but I didn't recreate the conditions properly by using page up/down for the profiling. Which isn't actually very slow. Except that slightly misleadingly, in Valgrind it is.

Frenzie avatar Mar 29 '19 17:03 Frenzie

No profiling yet, but your suggested changes do feel slightly more responsive.

Frenzie avatar Mar 29 '19 17:03 Frenzie

I'll probably PR the first one and include it in the coming crengine bump as it looks super safe. Will give some thoughts to the second one this evening, some small optimisation looks possible.

But all that could be furthermore optimised by caching many stuff in credocument (the buffer, the rects for links, the rects for highlights) so we don't request them from crengine when playing with the menu, dict lookup... I had some try in the past, the gain was noticable, but the crengine times were so below the blitting times that it wasn't worth complexifying credocument.lua. But now that @NiLuJe has speed up the blitting...

poire-z avatar Mar 29 '19 17:03 poire-z

Proper profile of the current code for potential future reference. Incidentally, on my desktop it's practically faster in Valgrind than on my laptop without. Quite sad. :-)

callgrind.out.8576.tar.gz

And here it is with the changes:

callgrind.out.25914.tar.gz

Frenzie avatar Mar 29 '19 19:03 Frenzie

Looked at them, so a big rectangle on the left disappeared, and make all the other get bigger. So, I guess it's good :) Tested on my Kobo, and it indeed feels smoother. Anyway, PR'ing safe changes.

poire-z avatar Mar 29 '19 19:03 poire-z

Here's one that could potentially be worth a look.

nwt_E.epub.zip

Via https://www.mobileread.com/forums/showthread.php?p=3870059#post3870059

Frenzie avatar Jul 22 '19 19:07 Frenzie

Just a quick note, for info and "good to know":

I'm reading a 67 KB small EPUB which results in 110 pages. I have some tuning in cr3.ini to have crengine use a cache for books smaller than our default. I noticed that re-opening this book was taking 2s, even though a cache is made and used - so a longer time than most other larger books :/

It turns out this book (by Thomas Bernhard, dunno if he is known for that) is a single 110-pages paragraph - so crengine has to format this whole paragraph for drawing slices of it, which makes the re-opening as long as a re-rendering (after this opening, this formatting is cached in memory, so everything then is quick).

poire-z avatar Oct 08 '21 12:10 poire-z