Filling `name` causes thousands of queries
Short description of the issue
When there are n pages with the same title, finding a name for the next one causes n queries. This is slow when n>1000.
Expected behavior
I hope there was a way to give up on sequential numbering after a configurable number of tries.
Actual behavior
I have thousands of pages that are created without an explicit, unique(ish) title. Instead, each page will have something like Board member as its title. When a new page is created, PagesNames.php tries to find a name for the new page, causing thousands of queries.
SELECT COUNT(*) FROM pages WHERE name='board-member-1799' AND parent_id='1234'
SELECT COUNT(*) FROM pages WHERE name='board-member-1800' AND parent_id='1234'
SELECT COUNT(*) FROM pages WHERE name='board-member-1801' AND parent_id='1234'
Yes, I could probably fix this outside of PagesNames.php, but this would require lots of changes in our large codebase. I'd like to use a fix that fixes the problem in all occasions. Currently, I've worked around this with a patch:
--- wire/core/PagesNames.php 2024-11-13 21:16:53
+++ wire/core/PagesNames.php 2024-11-27 14:50:12
@@ -491,8 +491,16 @@
}
}
+ $counter = 1;
+ $originalName = $name;
while($this->pageNameExists($name, $options)) {
- $name = $this->incrementName($name);
+ if(++$counter > 5) {
+ // If we’ve tried 5 times and still can’t get a unique name, just append a random number.
+ // This prevents pageNameExists() from being called thousands of times.
+ $name = $originalName . $this->delimiter . mt_rand(1e9, 9e9);
+ } else {
+ $name = $this->incrementName($name);
+ }
}
if(strlen($name) > $this->nameMaxLength) $name = $this->adjustNameLength($name);
Can you think of a better fix?
Steps to reproduce the issue
-
Create 10 pages with the same title
-
When creating the 10th page, observe SQL queries like these:
... SELECT COUNT() FROM pages WHERE name='foo-7' AND parent_id='1234' SELECT COUNT() FROM pages WHERE name='foo-8' AND parent_id='1234' SELECT COUNT(*) FROM pages WHERE name='foo-9' AND parent_id='1234'
Environment
- ProcessWire version: newest dev
@tuomassalo, please verify the fix.
It seems that there two different paths that lead to the large number of queries.
I don't recall now how to reproduce the problem I described in the ticket, but now that I test the suggested fix with the code below, I don't think it does the trick here. Test code:
<?php
include('/var/www/html/index.php');
$queries = $wire->database->queryLog(true); // empty the log
for($i = 0; $i < 10; $i++) {
$page = new \ProcessWire\Page;
$page->template = 'basic-page';
$page->parent = '/';
$page->name = "issue2003";
$page->title = "Child $i";
$page->save();
}
$queries = $wire->database->getQueryLog();
echo '<p>Number queries: ' . count($queries) . '</p>';
foreach($queries as $query){
echo '<p>' . htmlspecialchars($query) . '</p>';
}
With the suggested fix, the code produces the output below, which I think is unexpected:
[...]
SELECT id, status, parent_id FROM pages WHERE name='issue2003' AND id!=0 -- [0.1ms]
SELECT id, status, parent_id FROM pages WHERE name='issue2003-1' AND id!=0 -- [0.1ms]
SELECT id, status, parent_id FROM pages WHERE name='issue2003-2' AND id!=0 -- [0.1ms]
SELECT id, status, parent_id FROM pages WHERE name='issue2003-3' AND id!=0 -- [0.1ms]
SELECT id, status, parent_id FROM pages WHERE name='issue2003-4' AND id!=0 -- [0.1ms]
SELECT id, status, parent_id FROM pages WHERE name='issue2003-5' AND id!=0 -- [0.1ms]
SELECT id, status, parent_id FROM pages WHERE name='issue2003-6' AND id!=0 -- [0.1ms]
SELECT id, status, parent_id FROM pages WHERE name='issue2003-7' AND id!=0 -- [0.1ms]
SELECT id, status, parent_id FROM pages WHERE name='issue2003-8' AND id!=0 -- [0.1ms]
SELECT id, status, parent_id FROM pages WHERE name='issue2003-9' AND id!=0 -- [0.1ms]
[...]
So, I think there's another scenario where the patch does work.