Page trail: checking the existence of items slows down the response time for larger wikis

Open UlrichB22 opened this issue 1 year ago • 3 comments

I tested an imported wiki with 2400 items and about 12000 revisions. Each whoosh index query takes about 120 ms. At the start of my tests the browser history was empty; I just viewed TestItem01, TestItem02, ... TestItem05 and then TestItem01 again. The following server output was produced with additional clock timers:

DEBUG 2024-10-17 21:55:54,261 moin.utils.clock:48 timer total(0): 1564.11ms /TestItem01
DEBUG 2024-10-17 21:55:58,699 moin.utils.clock:48 timer total(0): 1602.45ms /TestItem02
DEBUG 2024-10-17 21:56:03,658 moin.utils.clock:48 timer total(0): 1795.20ms /TestItem03
DEBUG 2024-10-17 21:56:08,248 moin.utils.clock:48 timer total(0): 2165.31ms /TestItem04
DEBUG 2024-10-17 21:56:13,020 moin.utils.clock:48 timer total(0): 2321.03ms /TestItem05
DEBUG 2024-10-17 21:56:17,832 moin.utils.clock:48 timer total(0): 2376.22ms /TestItem01

With 5 entries in the page trail, the response time increased by about 800 ms (from 1564 ms to 2376 ms). Two additional index queries are used for each item in the trail.
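To make the growth easier to see, here is a small sketch that parses the timer lines above and prints the increase relative to the first request. The regular expression and the helper name `parse_totals` are my own for illustration; they are not part of moin.

```python
import re

# Timer lines copied from the server output above.
LOG = """\
DEBUG 2024-10-17 21:55:54,261 moin.utils.clock:48 timer total(0): 1564.11ms /TestItem01
DEBUG 2024-10-17 21:55:58,699 moin.utils.clock:48 timer total(0): 1602.45ms /TestItem02
DEBUG 2024-10-17 21:56:03,658 moin.utils.clock:48 timer total(0): 1795.20ms /TestItem03
DEBUG 2024-10-17 21:56:08,248 moin.utils.clock:48 timer total(0): 2165.31ms /TestItem04
DEBUG 2024-10-17 21:56:13,020 moin.utils.clock:48 timer total(0): 2321.03ms /TestItem05
DEBUG 2024-10-17 21:56:17,832 moin.utils.clock:48 timer total(0): 2376.22ms /TestItem01
"""

# Assumed line layout: "... timer total(0): <ms>ms <path>"
PATTERN = re.compile(r"timer total\(0\): ([\d.]+)ms (\S+)")


def parse_totals(log):
    """Return a list of (path, total_ms) tuples in log order."""
    return [(m.group(2), float(m.group(1))) for m in PATTERN.finditer(log)]


totals = parse_totals(LOG)
baseline = totals[0][1]

for path, ms in totals:
    # Show each request's total and its distance from the first request.
    print(f"{path}: {ms:8.2f} ms  (+{ms - baseline:7.2f} ms)")

# Difference between the last and the first request, in milliseconds.
increase = totals[-1][1] - baseline
print(f"increase: {increase:.2f} ms")
```

The last line reports the difference between the first and the last request, which is the roughly 800 ms mentioned above.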

The relevant parts in the code are:

https://github.com/moinwiki/moin/blob/807bfe10d56cf37e878667e0363849ab1f7e3df4/src/moin/themes/__init__.py#L352

https://github.com/moinwiki/moin/blob/807bfe10d56cf37e878667e0363849ab1f7e3df4/src/moin/themes/__init__.py#L373

UlrichB22, Oct 17 '24 20:10

IMO there are two ways to regain performance:

  1. Store the exists status in the page trail at the time an item is shown, instead of checking it on every request. If someone else deletes the item in the meantime, the stored status may be out of date.
  2. Do not add non-existing items to the page trail at all, and remove the checks.
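A rough sketch of both options, not taken from the moin code base: `TrailEntry`, `CountingIndex`, `add_to_trail` and `render_trail` are hypothetical names, and the trail length of 5 and the move-to-end behaviour are assumptions. The point is only that the index is asked once per shown item and never while rendering the trail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrailEntry:
    """Hypothetical page trail entry that remembers the exists status."""
    name: str
    exists: bool


class CountingIndex:
    """Stand-in for the real index; counts how often it is queried."""

    def __init__(self, names):
        self._names = set(names)
        self.queries = 0

    def has_item(self, name):
        self.queries += 1
        return name in self._names


def add_to_trail(trail, name, exists, max_len=5, keep_missing=True):
    """Return a new trail with `name` moved to the end.

    keep_missing=True  -> option 1: store the exists status with the entry.
    keep_missing=False -> option 2: skip items that do not exist.
    """
    if not exists and not keep_missing:
        return list(trail)
    trail = [e for e in trail if e.name != name]
    trail.append(TrailEntry(name, exists))
    return trail[-max_len:]


def render_trail(trail):
    """Render the trail from the stored status, without touching the index."""
    return [e.name if e.exists else f"{e.name} (missing)" for e in trail]


index = CountingIndex(["TestItem01", "TestItem02"])
trail = []
for name in ["TestItem01", "NoSuchItem", "TestItem02", "TestItem01"]:
    # One lookup per request, assumed to happen anyway while showing the item.
    exists = index.has_item(name)
    trail = add_to_trail(trail, name, exists)

rendered = render_trail(trail)
print(rendered)
print(index.queries)
```

With option 1 the cost per request stays constant regardless of trail length, at the price of a possibly stale status. Passing `keep_missing=False` gives option 2, where the trail never contains missing items and no status needs to be shown.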

When I try to open an item by entering a non-existing item name in the browser URL, the create dialog is shown. At that moment the item has already been added to the page trail, even if I leave the dialog without creating the item. I am not sure this is useful.

UlrichB22, Oct 17 '24 20:10

120 ms seems like a long time just to see if an item exists. I keep thinking that somewhere in the processing below storage.get_item there is some code that opens the data file of every item in the page trail and includes it as part of the Item object, but I have been unable to find it. It would be more comforting if the procedure name were storage.get_item_meta.

I agree that having non-existent items in the page trail is not very useful, and I agree with your proposed solutions above.

RogerHaase, Oct 18 '24 17:10

Thanks, I will check the code in storage.get_item.

UlrichB22, Oct 18 '24 21:10

As far as I can see, the storage.get_item method does not read the data file.

UlrichB22, Nov 10 '24 13:11

Agreed, storage.get_item neither opens nor reads the data file.

Prior performance improvements removed cases where a file was opened but never read, for example creating an Item object with an open file when only metadata was needed.
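As an illustration of that pattern only (this is not the moin implementation, and `LazyItem` is a made-up class), an object can carry its metadata from the start and open the data file only when the content is actually requested:

```python
import os
import shutil
import tempfile


class LazyItem:
    """Hypothetical item: metadata is held in memory, the data file is
    opened only when read_data() is called."""

    def __init__(self, meta, data_path):
        self.meta = meta
        self._data_path = data_path
        self.opened = False  # demonstration flag, set on first file access

    def read_data(self):
        """Open, read and close the data file on demand."""
        self.opened = True
        with open(self._data_path, "rb") as f:
            return f.read()


# Build a throwaway data file so the example is self-contained.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "data")
with open(path, "wb") as f:
    f.write(b"wiki text")

item = LazyItem({"name": "TestItem01"}, path)

# A metadata-only check (as for a page trail entry) never touches the file.
name = item.meta["name"]
opened_after_meta = item.opened

# Only an explicit read opens the file.
content = item.read_data()
opened_after_read = item.opened

shutil.rmtree(tmpdir)
```

For an existence or metadata check, the file is never opened, so the cost of that check does not depend on the data file at all.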

RogerHaase, Nov 10 '24 16:11