moin icon indicating copy to clipboard operation
moin copied to clipboard

Indexing.py: Item init is called several times with the same query values when viewing an item

Open UlrichB22 opened this issue 1 year ago • 4 comments

While running performance tests on an imported wiki I found that the Item init method https://github.com/moinwiki/moin/blob/807bfe10d56cf37e878667e0363849ab1f7e3df4/src/moin/storage/middleware/indexing.py#L1064 is called several times with the same query values. Each Query causes a whoosh index query. Because of the index size for a wiki with 2400 items and about 12000 revisions each index query lasts about 120 ms. The following server output was produced with additional clock timers, see commit https://github.com/moinwiki/moin/commit/26bd5dc55ac1e114decab6d2e76c9186cc797b29.

For the test I created a simple MoinWiki item called 'TestItem' with a header only.

DEBUG 2024-10-11 23:08:20,354 moin.utils.clock:48 timer init(0): 3.29ms 
DEBUG 2024-10-11 23:08:20,530 moin.utils.clock:48 timer Item init(0): 174.80ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:08:20,664 moin.utils.clock:48 timer Item init(0): 133.91ms name: None query: {'itemid': 'cb33ee84eb3a4a54a31e2f4c5132ee5c'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:08:20,665 moin.utils.clock:48 timer Revision init(0): 0.02ms name: ['TestItem'] revid: 308f87c48ab044058631c7007b7a5030
DEBUG 2024-10-11 23:08:20,665 moin.utils.clock:48 timer show item create(0): 310.24ms TestItem rev_id=current

DEBUG 2024-10-11 23:08:20,668 moin.utils.clock:48 timer conv_in_dom(0): 2.10ms 
DEBUG 2024-10-11 23:08:20,669 moin.utils.clock:48 timer nowiki(0): 0.05ms 
DEBUG 2024-10-11 23:08:20,669 moin.utils.clock:48 timer conv_include(0): 0.10ms 
DEBUG 2024-10-11 23:08:20,669 moin.utils.clock:48 timer conv_macro(0): 0.08ms 
DEBUG 2024-10-11 23:08:20,670 moin.utils.clock:48 timer conv_link(0): 0.05ms 
DEBUG 2024-10-11 23:08:20,672 moin.utils.clock:48 timer conv_dom_html(0): 1.98ms 

DEBUG 2024-10-11 23:08:21,024 moin.utils.clock:48 timer Item init(0): 112.57ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:08:21,143 moin.utils.clock:48 timer Item init(0): 118.20ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:08:21,143 moin.utils.clock:48 timer protecting may(0): 118.49ms TestItem
DEBUG 2024-10-11 23:08:21,266 moin.utils.clock:48 timer Item init(0): 121.99ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:08:21,266 moin.utils.clock:48 timer protecting may(0): 122.24ms TestItem
DEBUG 2024-10-11 23:08:21,391 moin.utils.clock:48 timer Item init(0): 124.47ms name: /TestItem/Discussion query: {'name_exact': '/TestItem/Discussion', 'namespace': ''} itemid: undefined latest_doc: False
DEBUG 2024-10-11 23:08:21,510 moin.utils.clock:48 timer Item init(0): 118.51ms name: /TestItem/Discussion query: {'namespace': '', 'name_exact': '/TestItem/Discussion'} itemid: undefined latest_doc: False
DEBUG 2024-10-11 23:08:21,628 moin.utils.clock:48 timer Item init(0): 118.26ms name: /TestItem/Discussion query: {'namespace': '', 'name_exact': '/TestItem/Discussion'} itemid: undefined latest_doc: False
DEBUG 2024-10-11 23:08:21,628 moin.utils.clock:48 timer protecting may(0): 237.18ms /TestItem/Discussion
DEBUG 2024-10-11 23:08:21,768 moin.utils.clock:48 timer Item init(0): 139.43ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:08:21,768 moin.utils.clock:48 timer protecting may(0): 139.65ms TestItem
DEBUG 2024-10-11 23:08:21,905 moin.utils.clock:48 timer Item init(0): 136.24ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:08:21,906 moin.utils.clock:48 timer protecting may(0): 137.39ms TestItem
DEBUG 2024-10-11 23:08:22,038 moin.utils.clock:48 timer Item init(0): 130.20ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:08:22,039 moin.utils.clock:48 timer protecting may(0): 131.00ms TestItem
DEBUG 2024-10-11 23:08:22,194 moin.utils.clock:48 timer Item init(0): 122.39ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:08:22,194 moin.utils.clock:48 timer protecting may(0): 122.63ms TestItem
DEBUG 2024-10-11 23:08:22,196 moin.utils.clock:48 timer navibar(0): 0.15ms 
DEBUG 2024-10-11 23:08:22,323 moin.utils.clock:48 timer Item init(0): 126.49ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:08:22,442 moin.utils.clock:48 timer Item init(0): 118.78ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:08:22,559 moin.utils.clock:48 timer Item init(0): 116.36ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:08:22,677 moin.utils.clock:48 timer Item init(0): 118.09ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:08:22,678 moin.utils.clock:48 timer protecting may(0): 118.39ms TestItem
DEBUG 2024-10-11 23:08:22,678 moin.utils.clock:48 timer show item do_show(0): 2012.44ms TestItem rev_id=current

DEBUG 2024-10-11 23:08:22,679 moin.utils.clock:48 timer total(0): 2329.05ms /TestItem

Viewing the meta data of this item afterwards takes nearly the same time (2.5 secs).

UlrichB22 avatar Oct 11 '24 21:10 UlrichB22

To prove the impact of the redundant queries I wrote a short cache function in commit https://github.com/moinwiki/moin/commit/3e933ed81241d4663f81a28af9043fbbc9b1bded. This is experimental and should not be the solution. The result is looking like this

DEBUG 2024-10-11 23:24:28,422 moin.utils.clock:48 timer init(0): 4.27ms 
DEBUG 2024-10-11 23:24:28,589 moin.utils.clock:48 timer Item init(0): 164.12ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:24:28,722 moin.utils.clock:48 timer Item init(0): 133.30ms name: None query: {'itemid': 'cb33ee84eb3a4a54a31e2f4c5132ee5c'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:24:28,723 moin.utils.clock:48 timer Revision init(0): 0.02ms name: ['TestItem'] revid: 308f87c48ab044058631c7007b7a5030
DEBUG 2024-10-11 23:24:28,723 moin.utils.clock:48 timer show item create(0): 298.86ms TestItem rev_id=current

DEBUG 2024-10-11 23:24:28,725 moin.utils.clock:48 timer conv_in_dom(0): 1.25ms 
DEBUG 2024-10-11 23:24:28,725 moin.utils.clock:48 timer nowiki(0): 0.03ms 
DEBUG 2024-10-11 23:24:28,725 moin.utils.clock:48 timer conv_include(0): 0.12ms 
DEBUG 2024-10-11 23:24:28,725 moin.utils.clock:48 timer conv_macro(0): 0.07ms 
DEBUG 2024-10-11 23:24:28,726 moin.utils.clock:48 timer conv_link(0): 0.05ms 
DEBUG 2024-10-11 23:24:28,728 moin.utils.clock:48 timer conv_dom_html(0): 2.13ms 

DEBUG 2024-10-11 23:24:29,017 moin.utils.clock:48 timer Item init(0): 0.02ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:24:29,018 moin.utils.clock:48 timer Item init(0): 0.04ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:24:29,019 moin.utils.clock:48 timer protecting may(0): 0.62ms TestItem
DEBUG 2024-10-11 23:24:29,021 moin.utils.clock:48 timer Item init(0): 0.05ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:24:29,022 moin.utils.clock:48 timer protecting may(0): 0.68ms TestItem
DEBUG 2024-10-11 23:24:29,161 moin.utils.clock:48 timer Item init(0): 138.64ms name: /TestItem/Discussion query: {'name_exact': '/TestItem/Discussion', 'namespace': ''} itemid: undefined latest_doc: False
DEBUG 2024-10-11 23:24:29,289 moin.utils.clock:48 timer Item init(0): 126.96ms name: /TestItem/Discussion query: {'namespace': '', 'name_exact': '/TestItem/Discussion'} itemid: undefined latest_doc: False
DEBUG 2024-10-11 23:24:29,289 moin.utils.clock:48 timer Item init(0): 0.02ms name: /TestItem/Discussion query: {'namespace': '', 'name_exact': '/TestItem/Discussion'} itemid: undefined latest_doc: False
DEBUG 2024-10-11 23:24:29,289 moin.utils.clock:48 timer protecting may(0): 127.35ms /TestItem/Discussion
DEBUG 2024-10-11 23:24:29,289 moin.utils.clock:48 timer Item init(0): 0.01ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:24:29,289 moin.utils.clock:48 timer protecting may(0): 0.13ms TestItem
DEBUG 2024-10-11 23:24:29,290 moin.utils.clock:48 timer Item init(0): 0.03ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:24:29,291 moin.utils.clock:48 timer protecting may(0): 0.41ms TestItem
DEBUG 2024-10-11 23:24:29,291 moin.utils.clock:48 timer Item init(0): 0.03ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:24:29,291 moin.utils.clock:48 timer protecting may(0): 0.37ms TestItem
DEBUG 2024-10-11 23:24:29,320 moin.utils.clock:48 timer Item init(0): 0.02ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:24:29,320 moin.utils.clock:48 timer protecting may(0): 0.24ms TestItem
DEBUG 2024-10-11 23:24:29,322 moin.utils.clock:48 timer navibar(0): 0.25ms 
DEBUG 2024-10-11 23:24:29,325 moin.utils.clock:48 timer Item init(0): 0.09ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:24:29,325 moin.utils.clock:48 timer Item init(0): 0.02ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:24:29,325 moin.utils.clock:48 timer Item init(0): 0.01ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:24:29,326 moin.utils.clock:48 timer Item init(0): 0.02ms name: TestItem query: {'namespace': '', 'name_exact': 'TestItem'} itemid: cb33ee84eb3a4a54a31e2f4c5132ee5c latest_doc: False
DEBUG 2024-10-11 23:24:29,326 moin.utils.clock:48 timer protecting may(0): 0.24ms TestItem
DEBUG 2024-10-11 23:24:29,327 moin.utils.clock:48 timer show item do_show(0): 603.51ms TestItem rev_id=current

DEBUG 2024-10-11 23:24:29,329 moin.utils.clock:48 timer total(0): 911.13ms /TestItem

The response time dropped to 900 ms, and the display of the metadata immediately afterwards takes around 100 ms.

I need help and support on this issue. I did not see any call to the Items init function where latest_doc was used. This parameter should avoid running the same query repeatedly.

UlrichB22 avatar Oct 11 '24 21:10 UlrichB22

Several of those queries are caused by user.may.write(fqname) in templates/itemviews.html. IMO it is the best solution to move this function call into apps/frontend/views.py and pass the result to the context of the template (same as item_is_deleted). Please advise.

I wonder what this line is intended for: https://github.com/moinwiki/moin/blob/807bfe10d56cf37e878667e0363849ab1f7e3df4/src/moin/templates/itemviews.html#L10 Is there any use case where the template is called for a non-existent element?

UlrichB22 avatar Oct 12 '24 21:10 UlrichB22

Good catch; I cannot think of a case where the template is called for a non-existent element.

Agree with your suggestion to move user.may.write to views.py and pass into context.

RogerHaase avatar Oct 13 '24 15:10 RogerHaase

The above fix reduces the response time in the above example to about 1500 ms - that is a reduction of 30 %. For smaller wikis, the reduction is only around 10 to 15 %.

UlrichB22 avatar Oct 16 '24 20:10 UlrichB22

Running a diff from an item history gives following traceback:

File "/mypath/moin/src/moin/templates/diff.html", line 6, in top-level template code
    {% extends theme("show.html") %}
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mypath/moin/src/moin/templates/show.html", line 5, in top-level template code
    {% import "itemviews.html" as itemviews with context %}
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mypath/moin-venv-python3/lib/python3.12/site-packages/jinja2/environment.py", line 1408, in make_module
    return TemplateModule(self, ctx)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mypath/moin-venv-python3/lib/python3.12/site-packages/jinja2/environment.py", line 1540, in __init__
    body_stream = list(template.root_render_func(context))
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mypath/moin/src/moin/templates/itemviews.html", line 54, in top-level template code
    {%- if endpoint == 'frontend.modify_item' and may.write and not_trash %}
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mypath/moin-venv-python3/lib/python3.12/site-packages/jinja2/environment.py", line 487, in getattr
    return getattr(obj, attribute)
           ^^^^^^^^^^^^^^^^^^^^^^^
jinja2.exceptions.UndefinedError: 'may' is undefined
ERROR moin Exception on /+diff/MoinMoinWiki [GET]

UlrichB22 avatar Oct 26 '24 18:10 UlrichB22

The index queries for /TestItem/Discussion above can be ommited if the Discussion feature is not used. The feature can be deactived by following setting in wikiconfig.py:

    # deactivate discussion feature
    supplementation_item_names = []

I will add comments in wikiconfig.py .

UlrichB22 avatar Nov 01 '24 20:11 UlrichB22

The number of item init() calls has been dropped down to 3 with the lastest git version.

My test results with commit https://github.com/moinwiki/moin/commit/e4e67a5fcb35461ab7058c54322227ebd17feaab from last year October compared to the git branch https://github.com/UlrichB22/moin/tree/add_whoosh_idx are

  • old: DEBUG 2025-04-16 22:51:46,374 moin.utils.clock:48 timer total(0): 2601.40ms /TestItem
  • new: DEBUG 2025-04-16 22:54:05,439 moin.utils.clock:48 timer total(0): 564.16ms /TestItem

The test wiki which has been converted from moin 1.9 has

  • 2200 items and
  • 15000 revisions

And the index sizes are:

  • 89M wiki/index/all_revs_2rujdcvxcgihfgc0.seg
  • 6M wiki/index/latest_idx_jaw8c1pm4dybmcmp.seg
  • 96M wiki/index/latest_revs_aktav89p03sv88ry.seg

UlrichB22 avatar Apr 16 '25 21:04 UlrichB22