Improve performance of Sitemap.xml generation

Open RomanovM opened this issue 11 months ago • 1 comments

nopCommerce version: 4.60

We keep running into a problem with sitemap.xml generation when dealing with large product catalogs, on the scale of hundreds of thousands of products.

At present, the solution seems to be to write custom code or use a distinct service to generate the sitemap.xml for such large-scale catalogs. This isn't the most efficient solution and it would be extremely beneficial if the system could handle it more seamlessly.

Would it be possible to enhance this feature in the upcoming releases of nopCommerce? It would greatly improve our workflow and benefit other users with large catalogs as well.

Source: https://www.nopcommerce.com/boards/topic/97598/request-for-sitemapxml-generation-improvement-in-future-nopcommerce-version

RomanovM avatar Jul 18 '23 17:07 RomanovM

In my experience with massively parallelized processes like this, the 'low hanging fruit' is usually repeated calls to a function in child processes that do not require the context of the child.

From a brief review of the code, one possible optimization would be pulling

//~line 821
            var store = await _storeContext.GetCurrentStoreAsync();

            var languages = _localizationSettings.SeoFriendlyUrlsForLanguagesEnabled
                ? await _languageService.GetAllLanguagesAsync(storeId: store.Id)
                : null;

up from PrepareLocalizedSitemapUrlAsync(string routeName...) into the parent function, GenerateUrlsAsync(), and then passing the results down to PrepareLocalizedSitemapUrlAsync(... and to the other functions that call it, like GetCategoryUrlsAsync().

I realize that these calls are pretty lightweight, but getting the store context and the languages for the store once is certainly better than getting them n times.
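To make the hoisting idea concrete, here is a minimal, self-contained sketch. Everything in it is a hypothetical stand-in rather than the real nopCommerce code: the Store and Language records, the SitemapSketch class, and the simplified PrepareLocalizedUrls helper are all assumptions, and the sketch ignores the SeoFriendlyUrlsForLanguagesEnabled check for brevity. The lookup counters exist only to show that each lookup runs once per generation instead of once per URL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-ins, NOT the real nopCommerce types.
public record Store(int Id, string Url);
public record Language(int Id, string UniqueSeoCode);

public class SitemapSketch
{
    // Counters only exist to show how often each lookup runs.
    public int StoreLookups { get; private set; }
    public int LanguageLookups { get; private set; }

    // Stand-in for the store context lookup.
    private Task<Store> GetCurrentStoreAsync()
    {
        StoreLookups++;
        return Task.FromResult(new Store(1, "https://example.com/"));
    }

    // Stand-in for the language service lookup.
    private Task<IList<Language>> GetAllLanguagesAsync(int storeId)
    {
        LanguageLookups++;
        IList<Language> languages = new List<Language>
        {
            new Language(1, "en"),
            new Language(2, "de")
        };
        return Task.FromResult(languages);
    }

    // Child function: receives the store and languages as parameters
    // instead of fetching them itself on every call.
    private static IEnumerable<string> PrepareLocalizedUrls(
        Store store, IList<Language> languages, string slug)
        => languages.Select(l => $"{store.Url}{l.UniqueSeoCode}/{slug}");

    // Parent function: fetches the invariant data once, then passes it down.
    public async Task<List<string>> GenerateUrlsAsync(IEnumerable<string> slugs)
    {
        var store = await GetCurrentStoreAsync();
        var languages = await GetAllLanguagesAsync(store.Id);

        return slugs
            .SelectMany(slug => PrepareLocalizedUrls(store, languages, slug))
            .ToList();
    }
}
```

With three slugs and two languages this produces six URLs while StoreLookups and LanguageLookups each stay at 1. Whether the saving is measurable in the real generator depends on how well those lookups are already cached.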

Another, more complicated idea: from what I understand, any 'extra' properties provided in the sitemap file will just be ignored by indexers, so the entity id could be included in the sitemap. The existing Sitemap.xml file could then be parsed as a cache, using the lastmod property as a comparer for the entity. For products, categories and manufacturers that haven't changed since the sitemap was last generated, the existing entry could be reused instead of regenerating the route. Whether this extra logic would actually be faster, and whether the sitemap's accuracy would be maintained, would have to be determined.
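A rough sketch of how that cache lookup might work, using only System.Xml.Linq. The extension namespace (urn:example:sitemap-ext), the ext:entity element, the "type:id" key format, and the Entity and CachedEntry records are all assumptions made for illustration; none of them exist in nopCommerce. The key combines entity type and id because products, categories and manufacturers can share the same numeric id.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

// Hypothetical types for illustration only.
public record Entity(string Key, DateTime UpdatedOnUtc);
public record CachedEntry(string Loc, DateTime LastModUtc);

public static class SitemapCacheSketch
{
    private static readonly XNamespace Sm = "http://www.sitemaps.org/schemas/sitemap/0.9";
    // Assumed extension namespace carrying the entity key.
    private static readonly XNamespace Ext = "urn:example:sitemap-ext";

    // Parse the previously generated sitemap into a lookup keyed by entity.
    // Entries without an entity key or a lastmod are skipped.
    public static Dictionary<string, CachedEntry> LoadCache(string xml)
    {
        return XDocument.Parse(xml).Root
            .Elements(Sm + "url")
            .Where(u => u.Element(Ext + "entity") != null
                     && u.Element(Sm + "loc") != null
                     && u.Element(Sm + "lastmod") != null)
            .ToDictionary(
                u => (string)u.Element(Ext + "entity"),
                u => new CachedEntry(
                    (string)u.Element(Sm + "loc"),
                    DateTime.Parse(
                        (string)u.Element(Sm + "lastmod"),
                        CultureInfo.InvariantCulture,
                        DateTimeStyles.AdjustToUniversal | DateTimeStyles.AssumeUniversal)));
    }

    // Reuse the cached URL when the entity has not changed since lastmod;
    // otherwise fall back to the (expensive) route generation.
    public static string ResolveLoc(
        Entity entity,
        IReadOnlyDictionary<string, CachedEntry> cache,
        Func<Entity, string> regenerate)
    {
        return cache.TryGetValue(entity.Key, out var hit)
               && entity.UpdatedOnUtc <= hit.LastModUtc
            ? hit.Loc
            : regenerate(entity);
    }
}
```

One accuracy risk with this approach: if a URL slug can change without the entity's own update timestamp changing, the cached entry would be stale, so the comparison would need to cover whatever actually determines the route.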

danFbach avatar Jul 18 '23 18:07 danFbach