Performance degrades on larger files
I love Rnote, which is why I decided to use it to take notes. The problem I'm having is that as the number of elements on the canvas increases (I use the continuous vertical layout), performance degrades, making it difficult to write fluidly and to see precisely what you've written on the screen; mostly, only incomprehensible marks appear. I read a bit about the project before writing here and noticed that there are already some ideas about changing the format to make it better. I understand the complexity behind it and that it will take time, but do you think these changes will solve this problem? Are there currently any solutions other than splitting the content into multiple files? I'll leave a demonstration video; the problem is amplified in this case by the screen recording (I use Windows, and the file in question is 8 pages long).
Video:
https://github.com/user-attachments/assets/9a7a87c3-8a07-459b-814b-0345f3a105d0
What is the size of the file that you made this video on? Behaviour like the one in the video is not supposed to happen no matter the size of the file. Can you activate developer mode and show visual debugging information (in the top right menu)? There should be information on the total amount of strokes etc.
The .rnote file size is 1.42MB, the file contains quite a bit of text (not handwritten but keyboard-typed), a couple of images and several hand-drawn strokes.
This should really not cause a performance issue then. What are your hardware specs? Have you tried monitoring the resource usage of rnote and of your machine in general?
As long as there are not many strokes everything works fine; the problem begins when the documents start to get bigger (not in the sense of memory occupied but in the number of elements in the JSON file that describes the canvas). Here are the specs of the hardware I am using, which is not very powerful but should be acceptable (maybe the video card could be a bottleneck):
CPU: AMD Athlon 300U
GPU: AMD Radeon Vega 3 (Picasso) 2GB
RAM: 8GB DDR4 SDRAM
OS: Windows 10 Home (x64)
Edit: I tried with:
CPU: Intel Core i5-7500
GPU: NVIDIA GeForce GTX 1060 6GB
RAM: 16GB DDR4
OS: Windows 10 Home (x64)
The situation improves, although it still sometimes freezes and draws incomprehensible lines (especially when writing fast).
I've had exactly the same issue; just ignored it. I don't think this is a hardware thing - I've been through 2 different laptops in the past couple of years using Rnote and this same problem has persisted.
This is 100% some type of bug or optimisation issue ^
Can reproduce semi-reliably. Not sure what the cause is. Could be that no events are received when the gpu renderer is in progress (so that if fps drop the events aren't here) ? Or somehow events are skipped/dropped ?
I can't understand the problem either. I think it has to do with the rendering of the strokes which, in a file with many strokes, take time to appear on the canvas and end up altered. So the problem could be in the algorithm that optimizes the strokes (which from what I know are not raw) or in something related to the UI. From the tests done it seems to be a CPU processing problem, since I noticed the same problem with the dedicated video card. On a PC with excellent performance this may not occur, but to reproduce the error you could try to stress the CPU or GPU a bit and test under those conditions (even trivially with an active screen recording). I will investigate the problem further as far as I can and update the post.
If it can be useful I attach another video with details of the resources in use highlighted.
https://github.com/user-attachments/assets/40fa7dbc-1c89-41db-9571-1efc4cc9b2dd
I later tried to use the simple model for the stroke, the result is something understandable compared to what I actually wrote but very sketchy and always with a delay of a few seconds before it appears on the canvas. The file on which I do these tests is always the one quite large, on files of a single page I have no problems.
Edit: If it's of any interest, I’ve noticed that even when typing with a keyboard in the large file, there’s a delay—although all the letters do eventually appear. It seems to be a rendering or processing issue: with keyboard input, characters are handled correctly despite the lag, but with pen input, some parts are lost, resulting in strokes that look very different from the actual handwriting
If it can help, I noticed that the problem occurs when switching between strokes. For example, if I continue to write the same stroke, it continues to write it in a fairly fluid manner, the problem occurs when you lift the pen to write another stroke (which in handwriting often happens)
Video:
https://github.com/user-attachments/assets/bf247581-e447-4c42-bb95-0da13a832d78
So I just pushed this to the extreme and created a file with 70k strokes. I managed to reproduce the lag when inserting new strokes, but I noticed that the lag was getting better the more I wrote on the new page. So it might be related to the stroke store doing some weird re-balancing when inserting in a previously unpopulated space. Not sure how to resolve this though.
Personally, even leaving some space from the previous sections (I tried to get up to 4 pages ahead), it still goes badly, you can't write anything understandable. It would be nice if this bug was solved, it's a shame that rnote can't be used for heavier issues like keeping a whole notebook of notes on not overly performing PCs. Thank you very much for your contribution
I do have some ideas for this.
The first doubt I have is concerning the behavior of connect_event, particularly self.pointer_controller.connect_event which handles all pen input through super::input::handle_pointer_controller_event. If this function takes more than a frame to complete, does that block gtk from doing the next frame ? It's part of the event propagation mechanism and gtk needs the return value to continue/stop the event propagation so it could be blocking in some way (do we freeze gtk rendering until the event loop is finished ? do we have an abort mechanism ?).
One could check inside of handle_pointer_controller_event with debug traces whether or not the events are still present when the bug occurs or not (do we have "jumps" in these ? Do we have late events saved in the event history - if more than one event occured between two frames, you get the additional ones in the event history, say for a 240 Hz pen on a 120 Hz screen-). Of particular interest is what happens inside retrieve_pointer_elements and what gets filtered/not filtered from event.history().into_iter() (maybe the timings gets messed up somehow and we reject events from this). The real timing is given by the gtk timestamp, like done in https://github.com/flxzt/rnote/pull/1290/files#diff-488bca4d88a2d32fbe68851282e211cf93e04b446a54e79a25f4e73081d86de2L370
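One way to look for such "jumps" in a debug trace is to compare consecutive event timestamps. The following is a minimal sketch using only the standard library; `find_gaps` and its threshold are hypothetical helpers, not rnote code, and it assumes timestamps in milliseconds as reported by GTK.

```rust
/// Returns, for every event whose gap to the previous event exceeds
/// `max_gap_ms`, its index in `timestamps_ms` and the size of the gap.
/// Hypothetical debugging helper, not part of rnote.
fn find_gaps(timestamps_ms: &[u32], max_gap_ms: u32) -> Vec<(usize, u32)> {
    timestamps_ms
        .windows(2)
        .enumerate()
        .filter_map(|(i, pair)| {
            // saturating_sub avoids underflow if timestamps arrive out of order
            let gap = pair[1].saturating_sub(pair[0]);
            (gap > max_gap_ms).then_some((i + 1, gap))
        })
        .collect()
}
```

Feeding it the timestamps of the latest event plus its history for each call of the input handler would show whether events are missing or merely delivered late.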
Now, as for the fact that this
So I just pushed this to the extreme and created a file with 70k strokes. I managed to reproduce the lag when inserting new strokes, but I noticed that the lag was getting better the more I wrote on the new page. So it might be related to the stroke store doing some weird re-balancing when inserting in a previously unpopulated space. Not sure how to resolve this though.
Now for this I have some idea. When you first do a pen down, there is a call to engine_view.store.insert_stroke which in turns calls
self.key_tree.insert_with_key(key, bounds);
Maybe that is the culprit line, taking too much time at the start of the stroke. This adds the current stroke with its start bounds to an rtree (used for faster positional queries for intersections/selections; maybe not everywhere, but you get the point).
There is also for each subsequent elements added to the stroke a call to update_geometry_for_stroke which calls
self.key_tree.update_with_key(key, stroke.bounds());
For pen strokes this might be a performance footgun, because it removes the element from the keytree and then adds it back with the new (growing) bounds. I'm pretty sure (TBC) that the keytree isn't queried while writing a stroke (or maybe the current stroke doesn't have to be queried), so we can probably remove these calls for pen strokes and keep only a final one at the end of the stroke, adding the stroke to the rtree only then. Maybe that still wouldn't be enough, but at least there is more time between the end of a stroke and the start of a new one than between two consecutive pen input events.
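The idea of updating the index only once per stroke can be sketched as follows. This is a toy model, not rnote's store: `ToyIndex` stands in for the key tree and only counts writes, and all names are hypothetical. It also assumes that nothing needs to query the index for the stroke while it is being drawn, which is exactly the point marked TBC above.

```rust
use std::collections::HashMap;

/// Axis-aligned bounds: (min_x, min_y, max_x, max_y).
type Bounds = (f64, f64, f64, f64);

/// Stand-in for the spatial index. It only records how many
/// (potentially expensive) writes were issued.
#[derive(Default)]
struct ToyIndex {
    entries: HashMap<u64, Bounds>,
    writes: usize,
}

impl ToyIndex {
    fn update_with_key(&mut self, key: u64, bounds: Bounds) {
        // In a real R-tree this would be a removal followed by an insertion.
        self.entries.insert(key, bounds);
        self.writes += 1;
    }
}

/// Grows `bounds` so that it contains the point `(x, y)`.
fn extend(bounds: Bounds, (x, y): (f64, f64)) -> Bounds {
    (bounds.0.min(x), bounds.1.min(y), bounds.2.max(x), bounds.3.max(y))
}

/// Tracks the bounds of the stroke locally and writes to the index
/// a single time, when the stroke ends.
fn draw_stroke_deferred(index: &mut ToyIndex, key: u64, points: &[(f64, f64)]) {
    if points.is_empty() {
        return;
    }
    let (x0, y0) = points[0];
    let mut bounds = (x0, y0, x0, y0);
    for &p in &points[1..] {
        bounds = extend(bounds, p); // cheap, no index access
    }
    index.update_with_key(key, bounds); // single write at pen-up
}
```

With this shape, a stroke of any length costs one index write instead of one per input event.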
Also there might be some rebalancing of the rtree that ends up getting costly because of these updates (or maybe it doesn't rebalance well because of the start insert). Maybe that would explain why it seems to run better after a restart (same strokes but a different, rebuilt rtree that is in better shape? From #1438). To check that, maybe we can try to save the rtree in both situations.
Also the interface to the rtree could be changed for another data structure (if we can find another one that's better in practice for our uses as far as performance goes. Maybe a tiled structure ? That can motivate changes in the file system/renderer part)
Though maybe there are additional issues with memory usage in the renderer over time?
Not sure why there's such a delay in displaying things in extreme case though. At the very least it'd be good to debug why the pen input gets distorted.
Unfortunately I don't know Rust and surely my knowledge is inferior to yours; I'm very happy, however, that this problem has been taken into consideration. What I can say from observing the behavior is that the problem is not only with strokes: when typing text on the keyboard there is also a delay once the file starts to get large, as if there were a sort of buffer that accumulates the keys pressed and, a few seconds after you finish typing, all the text appears (this behavior is not present when the file is small, where each character appears the instant it is pressed). This leads me to think that the main problem is not with the pen itself but with something related to rendering. For strokes the problem is, in my opinion, similar: a sort of buffer accumulates the points, but in this case some are lost, with the result that the writing is incomprehensible and made mainly of straight lines (I'll show you a visual debug video in which I draw circular and sinusoidal shapes, but it "cuts" them with straight lines). There is also a delay from when you start pressing until the pen actually focuses. In the second video I draw a circle quickly, without waiting for it to focus. In the third I wait a second, holding down on the same spot for it to focus, and try to draw a circle, but a triangle comes out.
Video 1:
https://github.com/user-attachments/assets/6c93a304-e703-46c4-8fd2-ef233e338437
Video 2:
https://github.com/user-attachments/assets/4481f9b3-7873-4211-9f44-09180e2e9af3
Video 3:
https://github.com/user-attachments/assets/97c2f72d-343c-4d56-8775-9e7bfc27a260
Video extra, try to write by keyboard (the text appears several seconds after I finish typing):
https://github.com/user-attachments/assets/544b75eb-b0d4-4f66-8e43-47a076e443b3
Ps: In my case the file does not have excessive strokes but a lot of text written by keyboard (20 pages in total). I am available, eventually, to try some test versions.
Edit: If it is of interest, I have tried with version 0.9.4 and the problem is still there; rather than straight lines it draws dots, but even in this case they are incomprehensible and delayed.
I've started to debug print and add artificial delays to the input function. As expected, this function is blocking, meaning that if it takes longer than a frame the fps drops. With a delay that forces the fps below 15 it becomes choppy, but the event history compensates for that (so I get 41 events per frame). I can try to force a larger delay to see if this breaks at some point.
Can you run rnote with GTK_DEBUG=interactive (either GTK_DEBUG=interactive rnote if this is from the package manager, or flatpak run --env=GTK_DEBUG=interactive com.github.flxzt.rnote; I've not checked the last part, tab completion should fill it in), go to the settings in the second window, then activate the show framerate option? This adds a small fps counter in the top right corner of the screen, so that I can see what fps we get.
Not 100% sure my first idea is correct now, maybe if I go harder ?
I don't really know about the GTK event loop, but to me it seems like we are running into a bit of an issue with the R-Tree. The RTree we use is an R*-Tree, which is optimized for querying performance. The tradeoff is insertion complexity, which increases to O(NlogN). This seems to me like the place where we are spending most of our time, as the lag only seems to occur when starting a new stroke. During the stroke, performance is fine, because we are using the fast querying to get a mutable reference to the current stroke and are modifying it in place. We could either solve this by swapping out the RTree (not sure which other data structure would be more appropriate though) or have some way of notifying the renderer of a new stroke without going through the R-Tree and have the insertion run async in the background, which would probably lead to tons of issues and bugs, so it's not really an option.
Well I thought it was that but it seemingly isn't (it could very well cause lag but seemingly not cause the "curves are line" scenario). Tested only on windows though
I want to find why we lose events first before doing optimisations.
The issue with the R-tree is probably that this N is also the N on all pages (we don't have a "per page" separation that could help us here)
I can't reproduce the videos. I have tried increasing the number of strokes to totally absurd amounts (100k) but the strokes were still modeled perfectly. If I turned off visual debugging, I didn't even notice performance impacts, except for the short delay when inserting. Maybe it has something to do with the path modeling of the stroke as well?
Something strange is actually happening. OS: Windows 10 Home
https://github.com/user-attachments/assets/3c41e77b-cbf6-4576-95d7-398d55200ed9
This seems to me like the place that we are spending most of our time at, as the lag only seems to occur when starting a new stroke. During the stroke performance is fine, because we are using the fast querying to get a mutable reference to the current stroke and are modifying it in place.
I agree with this experience
Maybe it has something to do with the path modeling of the stroke as well?
We could profile this part (and maybe optimize that part) but if I'm not mistaken @Intranox also has the issue with the simple rendering part (is it the case? can you retest with the simple path modeling and maybe do 3-4 lines in quick succession to see if they all appear and if so if they appear with only 2 elements or more?)
Something strange is actually happening. OS: Windows 10 Home
Seems like perf actually tanks here. Is it at the start of a stroke?
I've gone further and added an arbitrary delay to the input handler. If I do one stroke it's fine (no loss).
But if I do multiple in succession I don't get everything. That's probably because the pen mode is changed only once per call (from the event itself, so the latest element sent for that frame), and then we send all the events in the history plus the latest one, in that order, to be handled by the pen/brush/tool. So we probably send to the pen path builder a series of events like Down ... Down Up ... Up Down ... Down Up (if there is the equivalent of two strokes in the history). Maybe at that stage this works out (we get the full path for both), or maybe it stops at the first one? I'm not sure what the actual mechanism at play is here (should it actually work?).
For pens there is no rate limiting (we can only lose events if the timings don't make sense).
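To make the "two strokes in one batch" case concrete, here is a small sketch of how a batch of buffered events could be split at pen-up boundaries before being handed to a builder. The `PenEvent` enum and `split_strokes` are hypothetical and much simpler than rnote's actual event types; this only illustrates the sequence described above.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum PenEvent {
    Down(f64, f64),
    Up(f64, f64),
}

/// Splits a batch of events into one point list per stroke.
/// A stroke ends at the first `Up` that follows at least one `Down`.
fn split_strokes(events: &[PenEvent]) -> Vec<Vec<(f64, f64)>> {
    let mut strokes = Vec::new();
    let mut current: Vec<(f64, f64)> = Vec::new();
    for event in events {
        match *event {
            PenEvent::Down(x, y) => current.push((x, y)),
            PenEvent::Up(x, y) => {
                if !current.is_empty() {
                    current.push((x, y));
                    strokes.push(std::mem::take(&mut current));
                }
                // Further `Up` events without a preceding `Down` are ignored.
            }
        }
    }
    if !current.is_empty() {
        strokes.push(current); // stroke still in progress at the end of the batch
    }
    strokes
}
```

If the handler instead applied a single pen mode to the whole batch, the second stroke in such a sequence could be merged into the first or dropped, which would match the symptom of strokes not appearing.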
So I can reproduce (artificially with additional delays) the lag + having stroke not appear but not the fact that events seem to disappear.
In the first video I tried to write with the simple model, the result is similar to the previous case. In the second video I tried to make 3 lines in succession that were represented with a delay, then I tried to make two circles next to each other but the result was always a line. Note: the problem with screen recording is amplified, if I try to make a circle without recording it creates a straight line and then starts to represent the circle in a rough way. This suggests to me that the problem is the delay from the first press to draw a stroke and when it actually starts to draw it (in this time frame it loses information on the stroke that is being drawn). In handwriting, since many consecutive strokes are written quickly, the delay accumulates and the shapes do not come out as actually drawn
Video 1
https://github.com/user-attachments/assets/1d3f1f8b-7474-47cd-a67f-b17979404298
Video 2
https://github.com/user-attachments/assets/cf0b7479-85ed-4d15-a7ab-85ac8ef378a4
Image: a circle drawn with the simple model, the first straight line should be curved, but it turned out so because of the delay between the moment I started drawing and the moment it actually started drawing. I think that is why the result of a handwritten text are straight lines. (In the circle you hold the pen down and after a while it actually starts to take the shape of a circle, in the written text the pen is often released immediately and the result will be just a straight line).
Just to be thorough: if you save the file then reopen it, circles are still lines right ?
Maybe get_stroke_mut sometimes fails to get the current brush stroke to push elements (so these don't get added to the stroke)?
I can maybe get some debug windows build to work, getting the output to show in the terminal is a little more complicated.
> $env:VAR="value"
> gtk4-demo *>&1 | echo
Just to be thorough: if you save the file then reopen it, circles are still lines right ?
Yes, if I save the file and reopen it I find everything exactly as I closed it.
We could either solve this by swapping out the RTree (not sure which other datastructure would be more appropriate though) or have some way of notifying the rendering of a new stroke without going through the R-Tree and have the insertion run async in the background, which would probably lead to tons of issues and bugs so its not really an option.
I don't know much about it and I don't know the implementation used. However, in my opinion it would not be necessary to remove the data structure but to optimize it, for example by maintaining a separate queue that keeps the most recent strokes and inserts them asynchronously and periodically into the tree. The queue will be fast and the insertion would be sustainable because there are usually pauses while writing. However, as said, I don't know if it's actually a good idea or not
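A rough sketch of this staging-queue idea, using a plain `Vec` as a stand-in for the tree, might look like the following. All names are hypothetical and this is not how rnote's store works today; the point is only that inserts become cheap while queries have to consult both the queue and the index.

```rust
use std::collections::VecDeque;

/// Axis-aligned bounds: (min_x, min_y, max_x, max_y).
type Bounds = (f64, f64, f64, f64);

/// Recent strokes wait in `pending` and are moved into `indexed`
/// (a stand-in for the R-tree) in small batches.
#[derive(Default)]
struct StagedStore {
    pending: VecDeque<(u64, Bounds)>,
    indexed: Vec<(u64, Bounds)>,
}

impl StagedStore {
    /// Constant-time on the input path: no tree work happens here.
    fn insert(&mut self, key: u64, bounds: Bounds) {
        self.pending.push_back((key, bounds));
    }

    /// Meant to be called during a pause; moves at most `budget`
    /// strokes into the index and returns how many were moved.
    fn flush(&mut self, budget: usize) -> usize {
        let n = budget.min(self.pending.len());
        self.indexed.extend(self.pending.drain(..n));
        n
    }

    /// Queries must look at both places, otherwise recent strokes are missed.
    fn keys_intersecting(&self, query: Bounds) -> Vec<u64> {
        self.indexed
            .iter()
            .chain(self.pending.iter())
            .filter(|entry| intersects(entry.1, query))
            .map(|entry| entry.0)
            .collect()
    }
}

fn intersects(a: Bounds, b: Bounds) -> bool {
    a.0 <= b.2 && b.0 <= a.2 && a.1 <= b.3 && b.1 <= a.3
}
```

The cost is that every query path has to remember the pending queue, which is one source of the bugs the earlier comment worries about.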
Any updates on this issue? Unfortunately I understand that it cannot be immediately resolved but I hope to be able to use Rnote to take study notes without the size problem. I note however that the problem is more intensified if there is a lot of text written on the keyboard, if you write by hand you can make a few more pages. I would also like to ask: what function is launched when you put the pen down? I don't think I'm capable of helping but at least I can try.
So I've done a first debug windows build here : https://github.com/Doublonmousse/rnote/actions/runs/15145490559 If you download the artefact, you get a zip containing a modified rnote exe (built on this branch https://github.com/Doublonmousse/rnote/actions/runs/15145490559)
Install this, then open powershell
cd "C:\Program Files\Rnote\bin"
.\rnote.exe *>&1 | echo
The first path may be slightly different if you install it somewhere else. Then rnote opens and I've added a few debug print (checking whether a condition fails and could explain the loss of pen input + printing the initial time spent inserting elements in the rtree for each new stroke). Maybe I can get a rough idea from this.
I want to understand why we lose events and start to get a feel for where the perf pitfall is (this might be linked though)
Beware: this version has a modified rnote file format. So if you can reproduce the issue on this version that'd be great, but please make sure to back up your files if you open previous files in that version (and don't forget to go back to the regular 0.12 version afterwards).
Another part of the code I'm having doubts about is the get_stroke_mut function. We're calling Arc::make_mut on the Stroke, and that clones the inner data if another Arc (a strong reference) to the same Stroke exists somewhere else; Weak references alone do not trigger a clone. So we potentially clone the Stroke and edit a copy rather than the one that's actually in the engine. Depending on your reply, I'll try to add other checks in other places.
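The behaviour of `Arc::make_mut` can be checked in isolation with the standard library. The sketch below uses a `Vec<u32>` in place of a `Stroke`; the function names are made up for the example.

```rust
use std::sync::Arc;

/// Edits through `make_mut` while a second `Arc` to the same data is alive.
/// Returns (value seen through the original handle, value seen through the edited handle).
fn edit_while_shared() -> (Vec<u32>, Vec<u32>) {
    let original = Arc::new(vec![1, 2, 3]);
    let mut handle = Arc::clone(&original); // second strong reference
    Arc::make_mut(&mut handle).push(4); // clones the data, then edits the clone
    ((*original).clone(), (*handle).clone())
}

/// With a single strong reference the data is not cloned,
/// even if a `Weak` pointer exists.
fn edit_while_unique() -> Vec<u32> {
    let mut only = Arc::new(vec![1, 2, 3]);
    let weak = Arc::downgrade(&only);
    Arc::make_mut(&mut only).push(4); // no clone; `weak` is disassociated
    assert!(weak.upgrade().is_none());
    (*only).clone()
}
```

In the shared case the original handle still sees `[1, 2, 3]`, so an edit made through `make_mut` would be invisible to whoever holds the other `Arc`. Whether such a second `Arc` to the current stroke exists in the engine during drawing is the open question here.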
In the logic, maybe we can also put some current data inside the brush structure (if we know we're reusing the value from the start to the end of a stroke), that could help here.
Thank you very much. I tried the indicated version and the result is visible on the video.
https://github.com/user-attachments/assets/dcc29747-c4ae-48e4-bd8c-e7a89ec1bd93
Note: the provided installer was missing the libraries libcrypto-3-x64.dll and libssl-3-x64.dll which I simply copied and pasted from the latest release to make it work
Test in an empty file (no problem):
https://github.com/user-attachments/assets/58abad66-6e70-4ad0-b7d9-ae11169a6ce3
Given the results, it seems more likely to me that the problem lies in Arc::make_mut. One idea would be to time each of the calls and see whether there is a delay and what the total time taken is.
EDIT: I have come to the conclusion that the problem is not so much the number of strokes but how complex they are. I have noticed that a block of text written by keyboard is inserted as a single stroke, if this contains a lot of information the performance degrades faster. I have created a test file with the debug version in which I have inserted a few blocks of text with a lot of textual content and the problem is reproducible even with a number of strokes less than 15. Handling these blocks is expensive as well as copying them. The created file is attached. You might also consider measuring the time taken by stroke.bounds()
Yeah, so I'm wrong on both accounts then
- The insert_stroke call doesn't seem to take more time to process (in the microseconds there)
- There is no apparent failure to push new elements to the current stroke on each update (if I'm wrong then it's not where I think it is)
I'll try to take a look at the Windows build's missing libraries. Now, if the rendering is slow (which could be a general issue for future improvements, GPU rendering etc.), that doesn't explain why it would impact handwriting so much when the strokes are not visible.
You might also consider measuring the time taken by stroke.bounds()
Maybe I can time the whole insert_stroke function then
I want to also track whether/(and if so where) we are actually losing events (given that I can't reproduce the handwriting getting messed up). Probably won't do this today though
The delay is not in the insertion of the tree and not even proportional to the number of strokes but to the general complexity, at least this seems to have come to light. I still have doubts about the insert_stroke function, especially on Arc::make_mut because 4 are made. In my opinion a more detailed debug could be done by viewing each part so as to be able to say exactly which piece is wasting time or if the insert_stroke is not completely involved. I am convinced, even if not sure, that the delay is also the cause of the incomprehensible writing or in any case by solving the first one the second is also solved in some way
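Timing each part separately, as suggested, needs nothing more than `std::time::Instant`. The helper below is a generic sketch; `timed` is a hypothetical name, and which calls to wrap (for example `stroke.bounds()` or each `Arc::make_mut`) is up to whoever instruments the code.

```rust
use std::time::{Duration, Instant};

/// Runs `f` and returns its result together with the elapsed wall-clock time.
fn timed<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}
```

Wrapping each suspect call in `timed` and printing the durations side by side would show which piece dominates, or whether insert_stroke is not involved at all.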
I've done another debug build here https://github.com/Doublonmousse/rnote/actions/runs/15226437777 I've not yet looked at the dll issue though.
Could you retry with the simple model for curves and show the results ? I want to make sure that there's no loss of strokes (and if so where this happens). The log is going to be a little larger this time, so if you could copy paste the terminal output that'd be great
Depending on the results, I'll ask for a more detailed log (using the trace info) with
$env:RUST_LOG = "trace"
set before running rnote
I'm trying to familiarize myself a bit with the debugs inserted and will edit the answer as I understand more, in the meantime I tried to draw a circle with the simple model (as shown previously on the thread) and the results are these while the resulting circle has an initial straight line.
HISTORY: size of the history before filtering 1 and after 1
HISTORY: event type ButtonPress. History is not read in that instance. History size 0
(rnote.exe:8576): Gdk-CRITICAL **: 15:25:50.894: gdk_event_get_history: assertion
'GDK_IS_EVENT_TYPE (event, GDK_MOTION_NOTIFY) || GDK_IS_EVENT_TYPE (event, GDK_SCROLL)'
failed
(rnote.exe:8576): Gdk-CRITICAL **: 15:25:50.895: gdk_event_get_history: assertion
'GDK_IS_EVENT_TYPE (event, GDK_MOTION_NOTIFY) || GDK_IS_EVENT_TYPE (event, GDK_SCROLL)'
failed
insertion of the stroke took Ok(1.2329ms) seconds with Ok(900ns) seconds for stroke bounds, Ok(809.3µs) for insertions
BRUSH_STATS: start of brush stroke with builder Simple
HISTORY: size of the history before filtering 3 and after 2
HISTORY: size of the history before filtering 2 and after 1
HISTORY: size of the history before filtering 1 and after 1
HISTORY: size of the history before filtering 1 and after 1
HISTORY: size of the history before filtering 1 and after 1
HISTORY: size of the history before filtering 1 and after 1
HISTORY: size of the history before filtering 1 and after 1
HISTORY: size of the history before filtering 1 and after 1
HISTORY: size of the history before filtering 1 and after 1
HISTORY: size of the history before filtering 1 and after 1
HISTORY: size of the history before filtering 1 and after 1
(rnote.exe:8576): Gdk-CRITICAL **: 15:25:51.243: gdk_event_get_history: assertion
'GDK_IS_EVENT_TYPE (event, GDK_MOTION_NOTIFY) || GDK_IS_EVENT_TYPE (event, GDK_SCROLL)'
failed
(rnote.exe:8576): Gdk-CRITICAL **: 15:25:51.243: gdk_event_get_history: assertion
'GDK_IS_EVENT_TYPE (event, GDK_MOTION_NOTIFY) || GDK_IS_EVENT_TYPE (event, GDK_SCROLL)'
failed
HISTORY: event type ButtonRelease. History is not read in that instance. History size 0
BRUSH_STATS: End of stroke with key Some(StrokeKey(8488v7)), sent 14 to the builder, got 15 back.
BRUSH_STATS: In the store, the key Some(StrokeKey(8488v7)) has 15 elements
HISTORY: size of the history before filtering 1 and after 1
The thing I notice is that that fail also occurs in a new document where the problem does not persist
Ok so, I think the first stroke weirdness comes from
HISTORY: size of the history before filtering 3 and after 2
rnote filters brushstroke events if they're too close to one another, so that they're at least 4ms apart. It seems that among the events that happen just after the pen touches the screen, one gets cut out by this.
It seems one of the events isn't timed correctly, or there's some other time weirdness like #1289 (so it gets filtered because it appears very close to another one). This can be changed with a change of backlog policy for the stroke (this is a single line, so pretty easy to change and test).
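The effect of such a minimum-spacing filter on a mistimed event can be reproduced with a few lines. This is a simplified sketch, not rnote's actual filter: only the 4 ms spacing comes from the description above, and `filter_by_min_gap` is a hypothetical name.

```rust
/// Keeps only events whose timestamp is at least `min_gap_ms` after the
/// last kept event. Timestamps are in milliseconds.
fn filter_by_min_gap(timestamps_ms: &[u64], min_gap_ms: u64) -> Vec<u64> {
    let mut kept: Vec<u64> = Vec::new();
    for &t in timestamps_ms {
        match kept.last() {
            // Too close to the previous kept event (or out of order): drop it.
            Some(&last) if t < last + min_gap_ms => {}
            _ => kept.push(t),
        }
    }
    kept
}
```

With timestamps `[100, 101, 106]` and a 4 ms gap, the second event is dropped and two remain, which has the same shape as the "before filtering 3 and after 2" log line. If the three events had been stamped 4 ms apart, all of them would have been kept.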
The timing for the insertion seems a little high but not enough to explain the lag. But for each stroke, the function that handles the input also requests a redraw (which happens on the current or next frame). So chances are the bottleneck is purely on the rendering side (and that makes the next call to the input handler hang until the rendering is done).