winter icon indicating copy to clipboard operation
winter copied to clipboard

Optimise CMS page loading by simplifying model query

Open jaxwilko opened this issue 3 years ago • 12 comments

This PR adds the method CmsObject::listInThemeArray which can be used to return an Collection of arrays containing filenames and url patterns.

Each element of the collection will look something like this:

[
    'fileName' => 'example.htm',
    'pattern' => '/example/url'
]

The router has been changed to call this method instead of listInTheme.

In short this means that instead of returning a collection of page objects and storing only filename & url, we can just retrieve the filename and url and directly store them as the url map in the router.

The speed results with these changes are as follows:

Test results

Tests have been done isolating the the time it takes for $page = $this->router->findByUrl($url); within Cms\Classes\Controller to execute. All tests have been done with 1000 CMS pages.

jaxwilko avatar May 27 '21 13:05 jaxwilko

@jaxwilko when this is ready for review, feel free to take it out of draft :)

bennothommo avatar Jun 17 '21 08:06 bennothommo

@bennothommo my bad, didn't realise it was still in draft lol

jaxwilko avatar Jun 17 '21 09:06 jaxwilko

@jaxwilko it wasn't - I put it in draft. It still had [WIP] in the title though :P

bennothommo avatar Jun 17 '21 09:06 bennothommo

This pull request will be closed and archived in 3 days, as there has been no activity in the last 60 days. If this is still being worked on, please respond and we will re-open this pull request. If this pull request is critical to your business, consider joining the Premium Support Program where a Service Level Agreement is offered.

github-actions[bot] avatar Aug 17 '21 00:08 github-actions[bot]

Bump

jaxwilko avatar Aug 17 '21 00:08 jaxwilko

This pull request will be closed and archived in 3 days, as there has been no activity in the last 60 days. If this is still being worked on, please respond and we will re-open this pull request. If this pull request is critical to your business, consider joining the Premium Support Program where a Service Level Agreement is offered.

github-actions[bot] avatar Oct 17 '21 00:10 github-actions[bot]

I've had a test of this - it does work, but I only see a very small performance benefit. I imagine it's more pronounced when there's more pages to read, but when there's only a handful of pages, it's minuscule.

I just had an idea though - given that for this (and our routing in general), we only want the filename and URL to be retrieved, could we be clever and format the URL to be the first line and just read the first line of pages to determine the URL? It would save us having to parse the sections on each page in order to retrieve the URL, and might make a larger performance boost.

We could maintain backwards compatibility by storing the URL as a "hashbang" of sorts - if the first line doesn't match the format, then assume it's still the old format and would need to be parsed for each of the sections, but we could perhaps save it in this new format on save, going forward?

Eg.

#</test-page
title = "Test Page"
layout = "home"
==
Content

As opposed to the current:

title = "Test Page"
url = "/test-page"
layout = "home"
==
Content

bennothommo avatar Dec 10 '21 08:12 bennothommo

Or even better, could we create a URL cache file and have all CMS objects register a URL in this cache along with a reference to the file name (and perhaps object class) and just use this for routing? Saves us reading all the files altogether.

bennothommo avatar Dec 10 '21 08:12 bennothommo

While I was testing I had 1000 CMS pages which each had a img tag with a base64 encoded image to simulate medium sized html content (in terms of bytes). With less pages this issue isn't massively apparent.

It was a while ago i looked at this, but the parser is actually super efficent and just opening the file and reading doing something like:

$fp = fopen($file, 'r');
while ($line = fgets($fp) && !feof($fp) && str_starts_with($line, 'url =')) {
    fclose($fp);
    return trim(explode('=', $line)[1]);
}

Was not really even quicker than just calling the parser.

The issue comes from creating a Halcyon model for each page (Cms\Classes\CmsObject::listInTheme), then discarding the model straight away (Cms\Classes\Router::getUrlMap) which means there was no point to wasting time and memory initing the models.

jaxwilko avatar Dec 10 '21 09:12 jaxwilko

@bennothommo for multilingual sites, the translated url in the viewBag needs to be read though...

mjauvin avatar Dec 10 '21 12:12 mjauvin

How does this compare to just using the route cache on the next couple of lines? If we have problems with cache invalidation then that's a simple fix, just trigger cache clears whenever the cache would be invalidated

LukeTowers avatar Dec 10 '21 17:12 LukeTowers

This pull request will be closed and archived in 3 days, as there has been no activity in the last 60 days. If this is still being worked on, please respond and we will re-open this pull request. If this pull request is critical to your business, consider joining the Premium Support Program where a Service Level Agreement is offered.

github-actions[bot] avatar Apr 04 '22 00:04 github-actions[bot]

This pull request will be closed and archived in 3 days, as there has been no activity in this pull request for the last 6 months. If you intend to continue working on this pull request, please respond within 3 days. If this pull request is critical for your business, please reach out to us at [email protected].

github-actions[bot] avatar Mar 14 '23 00:03 github-actions[bot]