joomla-cms
[5.2] SEF: Enforcing correct SEF URL
Summary of Changes
Joomla has been steadily improving its SEO performance, but one issue is still open: the same page can be reached via different URLs. An article can be reached via the intended URL, but also via the /component/content/article/<id> URL or even via index.php?option=com_content&view=article&catid=<catid>&id=<id>.
This PR adds a new feature which redirects URLs with query parameters to their SEF variant. The router parses the URL and if the original URL contained query parameters, it tries to create a new URL from the information it parsed and if that new URL does not contain any query parameters anymore, it does a redirect to that URL. This will remove a lot of the double URLs for one content item. This only happens for GET requests.
Note that this does NOT remove every case of duplicate content. For example, additional URL parameters are still kept.
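The decision described above can be sketched roughly as follows. This is only an illustration in Python, not the PHP router code from this PR: the route table, the article ID 17 for the first blog post, and the helper names `parse_url`, `build_url` and `redirect_target` are all made up for the example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical route table standing in for the real router.
# ID 18 comes from the testing instructions; ID 17 is an assumption.
ARTICLE_PATHS = {
    "17": "/park-blog/first-blog-post",
    "18": "/park-blog/second-blog-post",
}
PATH_TO_ID = {path: article_id for article_id, path in ARTICLE_PATHS.items()}


def parse_url(url):
    """Toy parse step: variables from the path, overridden by the query string."""
    parts = urlsplit(url)
    variables = {}
    if parts.path in PATH_TO_ID:
        variables = {
            "option": "com_content",
            "view": "article",
            "id": PATH_TO_ID[parts.path],
        }
    variables.update(dict(parse_qsl(parts.query)))
    return variables


def build_url(variables):
    """Toy build step: returns a SEF path plus any leftover query parameters."""
    leftover = dict(variables)
    is_article = (
        leftover.get("option") == "com_content"
        and leftover.get("view") == "article"
        and leftover.get("id") in ARTICLE_PATHS
    )
    if not is_article:
        return "/index.php?" + urlencode(leftover)
    path = ARTICLE_PATHS[leftover["id"]]
    for consumed in ("option", "view", "id", "catid"):
        leftover.pop(consumed, None)
    return path + ("?" + urlencode(leftover) if leftover else "")


def redirect_target(method, requested_url):
    """Return the URL to redirect to, or None if no redirect should happen."""
    if method != "GET":
        return None  # only GET requests are redirected
    if not urlsplit(requested_url).query:
        return None  # nothing to clean up without query parameters
    candidate = build_url(parse_url(requested_url))
    if urlsplit(candidate).query:
        return None  # leftover parameters: keep the original URL
    return candidate
```

With this toy table, `redirect_target("GET", "/park-blog/first-blog-post?id=18")` yields `/park-blog/second-blog-post`, while a request with an unrelated extra parameter such as `?foo=bar` is left alone.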
I'd like to thank ithelps Digital for sponsoring this feature.
Testing Instructions
- Apply the patch
- Use some content to test this with, for example the testing sample data.
- Go to /park-blog/first-blog-post in your site when you use the testing sample data.
- Add a ?id=18 to the URL and see that you actually get the "Second Blog Post" displayed instead of the first one.
- Enable the option in the SEF plugin.
- Reload the frontend page and see that you are redirected to /park-blog/second-blog-post.
Link to documentations
Please select:
- [X] Documentation link for docs.joomla.org: https://docs.joomla.org/Search_Engine_Friendly_URLs
- [ ] No documentation changes for docs.joomla.org needed
- [ ] Pull Request link for manual.joomla.org:
- [ ] No documentation changes for manual.joomla.org needed
I see what you want to achieve here, but:
- how can a new feature not require documentation (especially such a critical one)?
- the whole feature sounds like it's solved when we implement canonical tags
Sorry about the documentation. It was late yesterday and I will create documentation for the whole SEF thing in the next 48 hours.
Regarding the canonical tag I would disagree. To me, the canonical tag is just a bandaid to get around using the right URL and thus using the right URL in the output is the first step (which we do in Joomla anyway). The second step would then be to redirect wrong URLs to the right ones instead of keeping them and slapping a canonical on there.
I have tested this item :white_check_mark: successfully on 6ae36cb3509f4b4bf8ff2f3c323ca257401c7cf6
This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/42854.
I have tested this item :white_check_mark: successfully on 6ae36cb3509f4b4bf8ff2f3c323ca257401c7cf6
RTC
Link to documentation has been added.
I've played around with this PR using the "testing" dataset in Joomla 5.1.x.
I've run into an edge case:
- The testing dataset has an article that isn't accessible via its own menu item but only via a category list. It's article ID 1, "Administrator Components": index.php/content-component/article-category-list/extensions/components/administrator-components
- If I now open the "Archive Module" menu item (/index.php/content-component/archive-module) and add an extra query parameter to override the article ID with the ID 1 (/index.php/content-component/archive-module?id=1), I receive a redirect to this URL: /index.php/content-component/archive-module?view=article&id=1, whereas my expectation would have been a redirect to the category list URL mentioned above.
I'm aware that this is not a quirk of the current PR but of the router in general. However, the current PR might amplify its effects.
That's why I'm asking myself if we should limit the redirection to "clean" URLs without any leftover query parameters.
Thoughts? @Hackwar
I agree, this sounds good. I'll add this to the PR.
I encountered additional issues and will convert this back to draft first.
Back to pending.
What about a generic fetch/xhr request to index.php?option=com_xxx&... Do you mean that with this option enabled it will no longer be possible to perform a GET request to a raw Joomla URL? This would be too generic.
What about a generic fetch/xhr request to index.php?option=com_xxx&...
Fetch / XHR requests do follow redirects automatically.
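To illustrate the point about redirects being followed, here is a small self-contained sketch. It uses Python's `urllib` as a stand-in for a browser's fetch/XHR, and a toy local server that is purely an assumption for the demo (it is not Joomla): the raw `index.php?...` URL answers with a 301 and the SEF path answers with the content.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectingHandler(BaseHTTPRequestHandler):
    """Toy server: raw query URLs answer 301, every other path answers 200."""

    def do_GET(self):
        if self.path.startswith("/index.php?"):
            self.send_response(301)
            self.send_header("Location", "/park-blog/second-blog-post")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"Second Blog Post"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_following_redirects(raw_path):
    """Request raw_path from the toy server; return (status, final URL, body)."""
    server = HTTPServer(("127.0.0.1", 0), RedirectingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}{raw_path}"
        # An empty ProxyHandler makes sure the local request bypasses any proxy.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(url, timeout=5) as response:
            return response.status, response.geturl(), response.read()
    finally:
        server.shutdown()
        server.server_close()
```

The caller asks for the raw URL but ends up with the content of the SEF URL, without handling the 301 itself. Note that this only covers GET requests, which is also the only case the PR redirects.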
it works, perfect
I have tested this item :white_check_mark: successfully on 33d1153c7624eeba138f993783eb3abd907ba79a
I've tested the following scenarios:
- Local installation with the test dataset: I crawled the whole site before applying the patch and enabling the feature. After applying the patch, I recrawled the site and got no 404s.
- I've tried various non-SEF URL patterns (index.php?Itemid=..., index.php?option=com_users&view=reset&Itemid=...) and got the expected result
- I've tested the example provided in the test instructions
I refactored the PR to use the new option from #43432 to reduce the number of options we add here.
I have not tested this item.
Sorry, I can't reproduce the error with the given instructions. I am always redirected to the 404 error page.
I have tested this item :white_check_mark: successfully on 09f8ba345508d55ca324fa01218c5eef2f191869
I tested this using configuration which resulted in a /component/content URL. Then I enabled a menuitem which matched this URL and the code successfully did a 301 redirect to the SEF URL for that menuitem. I repeated this for several cases and they all worked ok.
I also set up a site module which was a list of articles in a given category. The URLs of the articles were based on a category list menuitem, and when I clicked on one of the links it routed me correctly to the article.
I then set up a menuitem pointing directly at the single article and reloaded the site page. It didn't redirect to the new menuitem (as there weren't any query parameters on the URL). But when the module displayed the links it utilised the menuitem for that single article.
RTC
Thanks @Hackwar !