Create cron scripts for catalog feeds
Live feeds (where a new feed is generated on every hit) are fast only for very small catalogs. For larger catalogs there should be a non-public cron script that regenerates the public feed (XML).
Maybe enable the caching system in the XML generation, with a short cache period.
I have done this for the Google Sitemap Multilanguage module (no idea why I have not published it yet ...).
So, the sitemap.xml is generated on the server and saved in the root as /sitemap.xml, but the URL is forwarded to the controller through .htaccess.
If the file is older than 2 hours, the next request regenerates it. I will post the module, and we can use the same idea for Google Feeds (I have done this once in an OC instance).
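To make the "regenerate if older than 2 hours" idea concrete, here is a minimal, language-neutral sketch in Python (the actual module is PHP, and this is not its code). The names `is_stale`, `serve_cached` and the `generate` callable are hypothetical, chosen only for illustration.

```python
import os
import time
from pathlib import Path
from typing import Callable, Optional

MAX_AGE_SECONDS = 2 * 60 * 60  # the two-hour threshold from the comment above


def is_stale(path: Path, max_age: float = MAX_AGE_SECONDS,
             now: Optional[float] = None) -> bool:
    """Return True when the cached file is missing or older than max_age."""
    if not path.exists():
        return True
    current = time.time() if now is None else now
    return current - path.stat().st_mtime > max_age


def serve_cached(path: Path, generate: Callable[[], str],
                 max_age: float = MAX_AGE_SECONDS) -> str:
    """Return the cached XML, regenerating it first if it is stale.

    `generate` is a hypothetical callable that builds the XML string.
    """
    if is_stale(path, max_age):
        tmp = path.with_name(path.name + ".tmp")
        tmp.write_text(generate(), encoding="utf-8")
        # Swap the finished file into place so readers never see a partial file.
        os.replace(tmp, path)
    return path.read_text(encoding="utf-8")
```

Note that in this sketch the request that finds a stale file still pays the full generation cost, which is exactly the weakness for large catalogs discussed in this issue.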
The problem here: I still need to create cron jobs for different tasks. And @CZechBoY is right: what we need is a cron engine for task execution that can be set up on the server. What do you think?
These are mostly XML tasks, imports and exports, which should be executed periodically. The problem arises if you don't have access to OS cron jobs.
@czechboy, can you send me your email?
This is about cron tasks. They are very necessary, and I currently use them in my projects: I can execute all registered tasks in the project and its extensions through a single command. But the question here is about the XML feed for Google. I think that regenerating the XML cache manually, via a button inside the module, would not be a bad idea.
Feeds change constantly. If you add a new product, the XML must be regenerated. If a customer buys a product, the feed must be regenerated. And so on.
@arnisjuraga why do you need my email?
A regenerate feed button is suitable for small catalogs; that is not the subject of this issue.
Regenerating the feed on each order is better, but the regeneration time could be greater than the time to the next order. Btw, how do you start the regeneration when an order is done? You probably don't want to let the customer wait until the feed (or several feeds) is regenerated...
"Regenerate feed on each order is better but the regeneration time could be greater than time to next order." Agree. That's why the right solution would be to have a cron job which can be executed at any moment.
The cron job should be checked at a short time interval, but it should only run when regeneration is actually needed.
"btw. how to start the regeneration when the order is done?" After a successful order, add a "flag" to the cron job saying that the site requires feed regeneration. The same must be done on product Update, Edit, Add, Import etc. in admin: just set the feed generation "flag" to "true".
Next time the cron is checked, it will see that regeneration is required.
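The flag idea could be sketched roughly like this. It is a Python illustration only, not Copona code; `FeedFlag`, `cron_tick` and the flag file location are hypothetical names, and a database column would work just as well as a file.

```python
from pathlib import Path
from typing import Callable


class FeedFlag:
    """File-based "needs regeneration" marker shared by the shop and the cron task."""

    def __init__(self, flag_path: Path) -> None:
        self.flag_path = flag_path

    def mark_dirty(self) -> None:
        # Called after a successful order, product add/edit/import, etc.
        self.flag_path.touch()

    def is_dirty(self) -> bool:
        return self.flag_path.exists()

    def clear(self) -> None:
        self.flag_path.unlink(missing_ok=True)


def cron_tick(flag: FeedFlag, regenerate: Callable[[], None]) -> bool:
    """One cron check: regenerate only if something changed. Returns True if it ran."""
    if not flag.is_dirty():
        return False
    # Clear first, so a change that arrives during regeneration sets the flag again.
    flag.clear()
    try:
        regenerate()
    except Exception:
        flag.mark_dirty()  # keep the request so the next tick retries
        raise
    return True
```

Setting the flag is cheap, so neither the customer nor the admin waits for the feed; the expensive work happens only in the cron check.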
"You probably don't want to let the customer wait until feed (or more feeds) is regenerated..." Right. Neither the customer nor the admin. We should try to execute the cron on the server side. It really depends on the hosting configuration. Not all hosts allow PHP execution. Most of them have a timeout or drop_connection setting for long-running PHP tasks. Something like this can help, if the server supports it:
https://stackoverflow.com/questions/2212635/best-way-to-manage-long-running-php-script
If it turns out to be widely supported, this could be the way to go. @prhost, can you describe what solution you are using for crons?
A bit off topic. The Google sitemap has an issue in that the products are listed 3 times with different URLs. This not only increases the generation time, but wastes valuable indexing time, as the other product URLs are fetched only to find the canonical link points to the first URL.
The following need removing. https://github.com/copona/copona/blob/master/catalog/controller/extension/feed/google_sitemap.php#L44-L53 https://github.com/copona/copona/blob/master/catalog/controller/extension/feed/google_sitemap.php#L92-L102
Also, the check for a product image should apply only to the image tags. As it is at the moment, if a product doesn't have an image it is left out of the sitemap. https://github.com/copona/copona/blob/master/catalog/controller/extension/feed/google_sitemap.php#L15
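As a rough illustration of both points (one URL per product, and the image check guarding only the image tags), a sketch in Python rather than the controller's PHP might look like this. The product dictionaries with `url` and `image` keys are an assumed data shape, not the real model output.

```python
from typing import Iterable, List, Mapping, Optional
from xml.sax.saxutils import escape


def product_entry(url: str, image_url: Optional[str] = None) -> str:
    """Build one <url> element; the image tags are added only when an image exists."""
    parts = ["<url>", f"<loc>{escape(url)}</loc>"]
    if image_url:
        parts.append(
            "<image:image>"
            f"<image:loc>{escape(image_url)}</image:loc>"
            "</image:image>"
        )
    parts.append("</url>")
    return "".join(parts)


def product_entries(products: Iterable[Mapping[str, Optional[str]]]) -> List[str]:
    """Emit exactly one entry per product, using its canonical URL."""
    return [product_entry(p["url"], p.get("image")) for p in products]
```

A product without an image still gets its `<url>` entry here; it simply has no `<image:image>` block.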
@ADDCreative Thanks, I am creating a new issue for this. I have fixed most of the problems in Google Feed on the dev site and will put it online after a few tests.