dokuwiki-plugin-move icon indicating copy to clipboard operation
dokuwiki-plugin-move copied to clipboard

Relative links are modified incorrectly

Open jerry32j opened this issue 4 years ago • 14 comments

I've written a thread about my issue here, and this plugin seems to be the cause.

Here are the key points:

In Preview mode the links reach the intended pages and all is well! So I hit Save and review the page and find that all the links have been mysteriously changed. I go to view the page source to see what happened, and sure enough the links are all different from what I wrote. This does not occur if I just write a fully qualified path to the desired page, only when I start using dots and colons.

Here are some examples: [[..:]] becomes [[:]] [[.:some_page:]] becomes [[..:some_page:]]

After process of elimination, I've found that the Move plugin appears to be responsible. When disabled, I am able to use the relative links as expected. When enabled, links are retroactively modified even before editing an affected page. I'm not sure whether this operation is carried out upon page read or plugin initialization.

jerry32j avatar Mar 11 '21 00:03 jerry32j

Is that only when you create a new page or does that happen repeatedly for the same page? Did you execute any move operations that could somehow explain this change? In theory the move plugin should modify links that are affected by a move operation once and then remove the information about the move. The modification should happen when the page is read for the first time after the move operation. As your report shows, somehow something must be broken here.

If you can reproduce this several times on the same page, it would be helpful if you could send me the .meta file associated with the affected page. You can find that file in data/meta/YOUR_NAMESPACE/SECOND_NAMESPACE/PAGE.meta, where the parts in caps are replaced by your actual namespace and page name. This file contains parts of text of the page, information about links and the users who created and modified the page. You can look at the file in a text editor to see what it actually contains. If you do not feel comfortable posting that file here, you can also send it to me privately via email and I promise not to publish it anywhere. You can also just extract a part of the file, there should be the string plugin_move somewhere and after that there is a { - everything after that till the matching closing } is the interesting part.

michitux avatar Mar 11 '21 10:03 michitux

It happens repeatedly for the same page. It appears not to modify the actual source code of the wiki page, since disabling the plugin reverts the changes, but it does change the code that appears in the editor when you edit an article.

I am having trouble reproducing the issue. I have an affected article that confirms the issue is present, and I've attempted to recreate the directory structure and am finding links are only modified in a way that correctly links to the intended article.

 

Here is the first snippet from /data/meta/Universes/Zero/Overview.meta. This is the affected page.

s:6:"origin";s:14:"Universes:Zero";s:5:"pages";a:3:{i:0;a:2:{i:0;s:14:"Universes:Zero";i:1;s:23:"Universes:Zero:Overview";}i:1;a:2:{i:0;s:9:"Universes";i:1;s:18:"Universes:Overview";}i:2;a:2:{i:0;s:22:"Universes:Zero:The_Hub";i:1;s:31:"Universes:Zero:The_Hub:Overview";}}s:5:"media";a:0:{}

This is a second (identical) plugin_move entry from the same file.

s:6:"origin";s:14:"Universes:Zero";s:5:"pages";a:3:{i:0;a:2:{i:0;s:14:"Universes:Zero";i:1;s:23:"Universes:Zero:Overview";}i:1;a:2:{i:0;s:9:"Universes";i:1;s:18:"Universes:Overview";}i:2;a:2:{i:0;s:22:"Universes:Zero:The_Hub";i:1;s:31:"Universes:Zero:The_Hub:Overview";}}s:5:"media";a:0:{}

This page contains [[..:|universe]] and [[.:The_Hub:]]. These refer to /Universes/Overview and /Universes/Zero/The_Hub/Overview, respectively.

When this plugin is enabled, these links are changed to [[.:|universe]] and [[..:The_Hub:]], which incorrectly refer to /Overview and /Universes/The_Hub/Overview.

 

This is the first plugin_move entry from /data/meta/Universes/Overview.meta.

s:6:"origin";s:9:"Universes";s:5:"pages";a:2:{i:0;a:2:{i:0;s:14:"Universes:Zero";i:1;s:23:"Universes:Zero:Overview";}i:1;a:2:{i:0;s:9:"Universes";i:1;s:18:"Universes:Overview";}}s:5:"media";a:0:{}

And the second (also identical).

s:6:"origin";s:9:"Universes";s:5:"pages";a:2:{i:0;a:2:{i:0;s:14:"Universes:Zero";i:1;s:23:"Universes:Zero:Overview";}i:1;a:2:{i:0;s:9:"Universes";i:1;s:18:"Universes:Overview";}}s:5:"media";a:0:{}

  From /data/meta/Universes/Zero/The_Hub/Overview.meta:

s:6:"origin";s:3:"hub";s:5:"pages";a:4:{i:0;a:2:{i:0;s:3:"hub";i:1;s:18:"Universes:Zero:hub";}i:1;a:2:{i:0;s:18:"Universes:Zero:hub";i:1;s:22:"Universes:Zero:The_Hub";}i:2;a:2:{i:0;s:14:"Universes:Zero";i:1;s:23:"Universes:Zero:Overview";}i:3;a:2:{i:0;s:22:"Universes:Zero:The_Hub";i:1;s:31:"Universes:Zero:The_Hub:Overview";}}s:5:"media";a:0:{}

And, finally, the second (also identical) snippet:

s:6:"origin";s:3:"hub";s:5:"pages";a:4:{i:0;a:2:{i:0;s:3:"hub";i:1;s:18:"Universes:Zero:hub";}i:1;a:2:{i:0;s:18:"Universes:Zero:hub";i:1;s:22:"Universes:Zero:The_Hub";}i:2;a:2:{i:0;s:14:"Universes:Zero";i:1;s:23:"Universes:Zero:Overview";}i:3;a:2:{i:0;s:22:"Universes:Zero:The_Hub";i:1;s:31:"Universes:Zero:The_Hub:Overview";}}s:5:"media";a:0:{}

jerry32j avatar Mar 11 '21 11:03 jerry32j

Is it possible that the file data/locks/_plugin_move.lock exists? This might be the case if you started some move operation in the tree-based move manager that did not finish. As far as I understand the code (I did not write that part myself) then the move plugin will never save the rewritten content and will not clear the metadata. Still, I think this does not fully explain the problem as the recorded move operations seem unrelated, they just explain why the changes are applied again and again and not saved.

Why are there uppercase letters in your page ids? This should not be possible without modifying DokuWiki, which might introduce all kinds of bugs. Is it possible that you modified some functions but not others? Such that, e.g., cleanID, getNS, noNS or resolve_pageid transform the link to lowercase and then parts do not match anymore as intended. I have not checked which of these methods does that by default, they just seem to be the most likely candidates involved. The method relativeLink is by default applied to all links. It should not modify links if they still resolve to the same page - which should be the case if there was no matching move - and I suspect that something that has been changed in your wiki breaks these checks.

michitux avatar Mar 11 '21 15:03 michitux

There is no such file; the only ones I have in there are the _dummy file and some lock files with hash names. I have not yet made use of the tree-based move manager. So far the only move operations I've performed were with the Rename Article toolbar button.

I have made changes to some of the DokuWiki PHP files to permit capitalization, and I recognize that this does have a "warranty-voiding" effect. One of the steps I took while I was troubleshooting yesterday was to perform a Wiki Upgrade from the ACP, and I verified afterward that my modifications were indeed reverted. The affected pages still show this unexpected behavior (even resolving to the same patterns, albeit with lowercase article IDs).

Unfortunately, I have no idea what else I can do to try and reproduce this issue in any kind of structured way. I feel like all I can do at this point is make a bunch of interlinked articles and randomly move them around until something breaks and try to guess at what happened. If you have any insight that may help me discover a reproducible case, it'd be much appreciated.

jerry32j avatar Mar 11 '21 16:03 jerry32j

Okay, thank you very much for testing all this and providing all this information. I currently see (possibly) two bugs here:

  1. For some reason, the move plugin decides that it is not safe to modify the pages you are viewing even though it probably should and thus does not clear its metadata (which is why the modifications are only temporary). Normally, this should only happen when the page is locked and modifications should then be applied and the metadata cleared after the lock is released - I am not sure why #191 reports something different, the code for this exists. Maybe there is some flaw in the logic that decides when edits can be applied safely. Apart from closely looking at the code again I have not really an idea how to debug this unless this can be reproduced. However, #151 sounds as if it was the same issue. In my local wiki where I have also moved pages using the move plugin I can clearly see that the changes were saved (they are visible as separate versions of the pages) and removed from the metadata so this only seems to happen under certain circumstances.
  2. There might be a problem with the algorithm for adjusting relative links. I have no idea why or how this should happen as there are rather early checks that should prevent any change for links that are still valid. The logic for the actual change seems rather complex though and at first glance I did not understand how it works. Understanding it again and writing tests for it might help. This is something I can do.

I cannot exclude that relative links are used rarely enough (if you use the link wizard, you always get absolute links) that the second problem has not been noticed before, in particular if it is not easily reproducible.

michitux avatar Mar 11 '21 17:03 michitux

I have just got an idea that might explain both issues. Is there something in your setup that might lead to strange paths to the data directory, e.g., do you use a custom path to the data directory that is possibly relative or uses symbolic links? Could you test if the code in lib/plugins/move/action/rewrite.php line 47 is executed and if in particular the check in line 50 is failing? (Just output some message using the msg() function.) It seems absolutely wrong to continue if that check is false, as we have no page id and are rewriting relative links relative to no page id. However, this might actually explain the behavior you are seeing.

michitux avatar Mar 11 '21 17:03 michitux

I'm using the Bitnami DokuWiki Stack, which seems to have things organized in a way that is somewhat different from the standard, though I haven't actually compared it to a plain DokuWiki instance to confirm/deny that observation. There are no shortcut folders or anything like that as far as I've seen, but I'm not sure that's the only thing you're asking about?

The following code is reached but evidently !$id is not true given that a message at the top of the inner block is not reached. Of course, the check on line 50 is also not reached.

if(!$id) {
	// try to reconstruct the id from the filename
	$path = $event->data[0][0];
	if(strpos($path, $conf['datadir']) === 0) {
		$path = substr($path, strlen($conf['datadir']) + 1);
		$id   = pathID($path);
	}
}

Interestingly, and this is making me suspicious that this may after all be related to my meddling with DokuWiki's decapitalization, $id is universes:zero:overview.

jerry32j avatar Mar 11 '21 17:03 jerry32j

I totally missed something in the data you provided. The metadata of the page Universes:Zero:Overview says that this page got moved from Universes:Zero. Accordingly, the move plugin thinks that the links [[..:|universe]] and [[.:The_Hub:]] are still relative to Universes:Zero and thus it interprets them as Overview and Universes:The_Hub:Overview and adjusts their syntax accordingly. So I do not believe there is anything wrong with the logic for adapting relative links. It's just the metadata somehow did not get cleared after the change had been applied for the first time.

Is it possible that the move plugin wrote a page universes:zero:overview (i.e., in lowercase)? If not, I would suggest to check the following (all in the move plugin):

  1. In action/rewrite.php, is $save set to true in handle_read?
  2. In helper/rewrite.php, is $save set to true in rewritePage?
  3. In helper/rewrite.php, is the call to unsetMoveMeta in line 285 reached?

If all of this is the case, I suspect that it is simply that some of these things are executed with the lowercase id. Though it cannot be everything, because otherwise loading the metadata would not succeed in the first place. Further, the metadata has obviously been written in the first case using the same function in DokuWiki.

michitux avatar Mar 11 '21 21:03 michitux

There exists no such file /data/pages/universes/zero/overview.txt. The URL is case-insensitive, so it leads to the same page. I'm not sure how else to confirm/deny the existence of such a page; there are files that fit this description in /data/attic, of course, but I suspect these are merely acknowledgements that such a page did once exist.

At action/rewrite.php:39 we see this block. The check passes and $save = false is executed.

// only rewrite if not in move already
$save = true;
	if(helper_plugin_move_rewrite::isLocked()) {
	$save = false;
}

helper/rewrite.php:264 has a function signature with the default $save parameter set to true, but it has been passed false. This outcome appears to happen regardless of which page is viewed.

public function rewritePage($id, $text = null, $save = true) {

helper/rewrite.php:276 is reached without ever modifying this variable or using it in any way. Consequently, this check fails and helper/rewrite.php:285 is not reached.

if ($save === true) {

jerry32j avatar Mar 12 '21 01:03 jerry32j

Is it possible that a file data/locks_plugin_move.lock (note the missing /) exists? This file (or the file inside the lock directory, but you already confirmed that that file doesn't exist) is also created when a single page is moved. It should be removed again after the move, but if the move is aborted for any reason this might not happen. [Edit] If the file exists, could you please also look at its content? It should be a single number. This number would basically indicated how many move operations haven't been completed.

michitux avatar Mar 12 '21 07:03 michitux

There is such a file, and it contains the number 3.

jerry32j avatar Mar 12 '21 09:03 jerry32j

Great, so we have finally found the reason for the strange behavior of the move plug-in. Have you moved just three pages, i.e., did every single move basically fail or have you moved significantly more pages so just a few of them failed to unlock again (every unlock decrements the number in the file)?

If you delete that file, the move plug-in should make those changes that you see permanent, you should then see a new revision of these pages. If your are not happy with the changes, you should be able to revert to the previous revision. The move plug-in should then also remove the metadata that you saw before and thus not apply the changes again.

michitux avatar Mar 12 '21 10:03 michitux

I'm not sure; I hadn't noticed the issue until quite a while after I began using the plugin. I've definitely made more than three moves, though, especially when trying to find the reproduction steps.

Deleting /data/locks_plugin_move.lock indeed permitted me to correct the links on all affected pages without my changes being reverted. Now the question that remains is what went wrong to begin with that caused this?

jerry32j avatar Mar 12 '21 10:03 jerry32j

I assume that some move operation did not complete successfully, maybe it aborted with an error and due to this the code that should have removed the lock hasn't been executed. I have just had a short look at the code and I fear that this can occur quite easily, even in cases where the move is intentionally not executed due to errors in the first steps. I suggest the following changes:

  • [ ] Ensure in helper_plugin_move_op::movePage that the lock is always released even when the method returns early.
  • [ ] Change the file name so the lock is inside data/locks/. This will also "fix" this problem for installations where the lock exists.
  • [ ] Display an error message to admin users when the lock exists and is older than, e.g., 10 minutes. The error message should also contain a link with instructions how to remove the lock. Maybe this will be possible via the admin interface.
  • [ ] Hide the rename page button when the lock exists. The idea is that it might be easier then to find out what happened as it is clear that the last move operation caused the problem. Further, this limits the impact to just the links on and to this last move page and avoids that problems all around the wiki are created.
  • [ ] Find out if there are any problems when a page is edited while the lock exists.

A possible solution might also be to just remove the locking from the single page moves, I think this lock is primarily meant for moves via the admin interface. There, the lock is completely removed after the move has been finished but it might still be a good idea to display a message when an old lock exists so the user can continue or abort that move operation as otherwise the same problems that you noticed will occur.

Unfortunately, I cannot give any estimate when I will have the time to work on this as working on DokuWiki (plugins) is not my job and my free time is mostly filled with other things.

michitux avatar Mar 20 '21 17:03 michitux