processwire-issues icon indicating copy to clipboard operation
processwire-issues copied to clipboard

Filling `name` causes thousands of queries

Open tuomassalo opened this issue 1 year ago • 2 comments

Short description of the issue

When there are n pages with the same title, finding a name for the next one causes n queries. This is slow when n>1000.

Expected behavior

I hope there was a way to give up on sequential numbering after a configurable number of tries.

Actual behavior

I have thousands of pages that are created without an explicit, unique(ish) title. Instead, each page will have something like Board member as its title. When a new page is created, PagesNames.php tries to find a name for the new page, causing thousands of queries.

SELECT COUNT(*) FROM pages WHERE name='board-member-1799' AND parent_id='1234'
SELECT COUNT(*) FROM pages WHERE name='board-member-1800' AND parent_id='1234'
SELECT COUNT(*) FROM pages WHERE name='board-member-1801' AND parent_id='1234'

Yes, I could probably fix this outside of PagesNames.php, but this would require lots of changes in our large codebase. I'd like to use a fix that fixes the problem in all occasions. Currently, I've worked around this with a patch:

--- wire/core/PagesNames.php	2024-11-13 21:16:53
+++ wire/core/PagesNames.php	2024-11-27 14:50:12
@@ -491,8 +491,16 @@
 			}
 		}
 		
+		$counter = 1;
+		$originalName = $name;
 		while($this->pageNameExists($name, $options)) {
-			$name = $this->incrementName($name);
+			if(++$counter > 5) {
+				// If we’ve tried 5 times and still can’t get a unique name, just append a random number.
+				// This prevents pageNameExists() from being called thousands of times.
+				$name = $originalName . $this->delimiter . mt_rand(1e9, 9e9);
+			} else {
+				$name = $this->incrementName($name);
+			}
 		}
 		
 		if(strlen($name) > $this->nameMaxLength) $name = $this->adjustNameLength($name);

Can you think of a better fix?

Steps to reproduce the issue

  1. Create 10 pages with the same title

  2. When creating the 10th page, observe SQL queries like these:

    ... SELECT COUNT() FROM pages WHERE name='foo-7' AND parent_id='1234' SELECT COUNT() FROM pages WHERE name='foo-8' AND parent_id='1234' SELECT COUNT(*) FROM pages WHERE name='foo-9' AND parent_id='1234'

Environment

  • ProcessWire version: newest dev

tuomassalo avatar Nov 28 '24 10:11 tuomassalo

@tuomassalo, please verify the fix.

matjazpotocnik avatar Oct 04 '25 18:10 matjazpotocnik

It seems that there two different paths that lead to the large number of queries.

I don't recall now how to reproduce the problem I described in the ticket, but now that I test the suggested fix with the code below, I don't think it does the trick here. Test code:

<?php
include('/var/www/html/index.php');

$queries = $wire->database->queryLog(true); // empty the log

for($i = 0; $i < 10; $i++) {
    $page = new \ProcessWire\Page;
    $page->template = 'basic-page';
    $page->parent = '/';
    $page->name = "issue2003";
    $page->title = "Child $i";
    $page->save();
}

$queries = $wire->database->getQueryLog();
echo '<p>Number queries: ' . count($queries) . '</p>';
foreach($queries as $query){
    echo '<p>' . htmlspecialchars($query) . '</p>';
}

With the suggested fix, the code produces the output below, which I think is unexpected:

[...]
SELECT id, status, parent_id FROM pages WHERE name='issue2003' AND id!=0 -- [0.1ms]
SELECT id, status, parent_id FROM pages WHERE name='issue2003-1' AND id!=0  -- [0.1ms]
SELECT id, status, parent_id FROM pages WHERE name='issue2003-2' AND id!=0  -- [0.1ms]
SELECT id, status, parent_id FROM pages WHERE name='issue2003-3' AND id!=0  -- [0.1ms]
SELECT id, status, parent_id FROM pages WHERE name='issue2003-4' AND id!=0  -- [0.1ms]
SELECT id, status, parent_id FROM pages WHERE name='issue2003-5' AND id!=0  -- [0.1ms]
SELECT id, status, parent_id FROM pages WHERE name='issue2003-6' AND id!=0  -- [0.1ms]
SELECT id, status, parent_id FROM pages WHERE name='issue2003-7' AND id!=0  -- [0.1ms]
SELECT id, status, parent_id FROM pages WHERE name='issue2003-8' AND id!=0  -- [0.1ms]
SELECT id, status, parent_id FROM pages WHERE name='issue2003-9' AND id!=0  -- [0.1ms]
[...]

So, I think there's another scenario where the patch does work.

tuomassalo avatar Oct 06 '25 14:10 tuomassalo