[Feature] Automatic update without cron job
I feel like it shouldn't be necessary to set up a cron job for automatic feed updating.
We already have the ability to perform a feed update by calling https://freshrss.example.net/i/?c=feed&a=actualize, and calling that URL every minute would perform a feed refresh at most once every 20 minutes.
Perhaps we could have an option to automatically call that URL every minute in the background whenever the FreshRSS page is open?
Yes, that would be possible, with some care about concurrent tasks. For now, it could be implemented as an extension.
I wonder if this extension could be adapted: https://github.com/Eisa01/FreshRSS---Auto-Refresh-Extension
At the moment it simply reloads the page if there is no activity, but it could be altered to fetch https://freshrss.example.net/i/?c=feed&a=actualize instead. The activity monitor should be disabled as well, since background feed checks would not be disruptive.
Also, FreshRSS already does a check every two minutes to see if new articles have been fetched, via https://freshrss.example.net/freshrss/i/?c=javascript&a=nbUnreadsPerFeed. The feed-fetching URL could be called at the same time, since the hard-coded minimum of 20 minutes would still be in effect.
Hi there,
I had a similar idea and just came across this thread.
However, I think the concept should be generalized:
- A cron job triggers an action at regular time intervals.
- A FreshRSS installation receives requests at random intervals from (possibly many) clients.
A 'client' may be a browser displaying the web interface, but also other sources like another RSS aggregator fetching a feed which is (re-)published by FreshRSS, or a smartphone app using one of the APIs.
-> What if we use the stochastic trigger from any request to check if a configurable time period has already elapsed, and if true, trigger the refresh action?
This would be something like a 'cron light' system. It would behave as if FreshRSS were triggered regularly by a cron job, as long as someone is interested in it, regardless of how they access it.
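As a rough illustration of the 'cron light' check, here is a minimal sketch of how any incoming request could decide whether the configured period has elapsed. The function names refreshIsDue and claimRun and the timestamp file are hypothetical and not part of FreshRSS. The file lock is one possible way to make sure that only one of several concurrent requests triggers the refresh.

```php
<?php
// Hypothetical helper, not part of FreshRSS: is a refresh due?
function refreshIsDue(int $lastRun, int $now, int $intervalSeconds): bool
{
    return ($now - $lastRun) >= $intervalSeconds;
}

// Hypothetical helper: claim the next run using a timestamp file.
// The exclusive lock ensures that, among concurrent requests,
// only one sees the interval as elapsed and records the new run time.
function claimRun(string $stampFile, int $now, int $intervalSeconds): bool
{
    $handle = fopen($stampFile, 'c+'); // create if missing, do not truncate
    if ($handle === false) {
        return false;
    }
    try {
        if (!flock($handle, LOCK_EX)) {
            return false;
        }
        $lastRun = (int) stream_get_contents($handle); // empty file reads as 0
        if (!refreshIsDue($lastRun, $now, $intervalSeconds)) {
            return false;
        }
        // Overwrite the stored timestamp with the current time.
        ftruncate($handle, 0);
        rewind($handle);
        fwrite($handle, (string) $now);
        fflush($handle);
        return true;
    } finally {
        flock($handle, LOCK_UN);
        fclose($handle);
    }
}
```

An extension could call something like claimRun($file, time(), 1200) at the start of each request and only start the refresh when it returns true.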
I, too, see this functionality being implemented as an extension. I might be capable of doing it, although I would need some pointers.
Before diving into details, what are your thoughts? Do you see conceptual problems?
As I see it, this URL simply needs to be triggered once per minute whenever the FreshRSS page is open: https://example.com/freshrss/i/?c=feed&a=actualize
But I do foresee one issue with replacing the cron job with this approach: fast-changing feeds would not be fetched while the browser is closed. This means that if the FreshRSS page was only accessed once per day, and a feed that holds 10 articles got 12 new articles in 24 hours, 2 articles would be "lost".
So it may well be the cron job is unavoidable.
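The arithmetic behind the "lost" articles can be sketched as follows, assuming the feed only ever exposes its most recent items. The function name missedArticles is hypothetical.

```php
<?php
// Hypothetical helper: how many articles drop out of a feed between two fetches.
// $feedSize:    number of items the publisher keeps in the feed.
// $newArticles: items published since the last successful fetch.
function missedArticles(int $feedSize, int $newArticles): int
{
    return max(0, $newArticles - $feedSize);
}

echo missedArticles(10, 12), "\n"; // 2: the example above
echo missedArticles(10, 8), "\n";  // 0: everything still fits in the feed
```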
Yes, this would be a downside. Here lies the difference between a 'cheap', self-made 'cron light' and a 'proper' real cron job.
But this limitation would also apply to your solution: when you close the browser, the URL would no longer get triggered, and FreshRSS would miss the 2 articles in the same way.
[Edit] You may also trigger the above url (or the URL intended for the cron job) by a device you own. Maybe your router is capable of doing this (RSS or DynDNS functionality)
Ok, I realized that we might be talking about very different scenarios:
- Scenario 1 (I guess this is triatic's): FRSS instance where...
  - A cron job is set up and runs, let's say, twice a day.
  - The user has the web interface open in a browser tab and is reading news.
  - -> The user might wish for some automatic 'extra' refreshing of all feeds, let's say every 15 or 20 minutes, while the browser window stays open.
- Scenario 2 (which I had in mind): FRSS instance where...
  - No cron job is set up.
  - FRSS fetches some feeds and filters them with user queries.
  - 'Headless' use: another feed aggregator (e.g. an app on a PC or phone, or a standalone media player device) fetches the user query results via RSS export.
  - -> The user wishes to have the feeds refreshed whenever the secondary feed aggregator tries to fetch the RSS export from FRSS.
I have read a lot of code over the last few days and now have a working prototype that should satisfy both scenarios (see next post). I still have some questions.
I was initially suggesting to have no cron, but I forgot about background updates being needed to avoid missing articles while the browser is closed. There is no point in having the browser fetch articles when the cron is doing that already, so scenario 1 doesn't make much sense.
As I say, the way to achieve scenario 2 is to make an extension which triggers the fetch URL once per minute whenever the FreshRSS page is open: https://example.com/freshrss/i/?c=feed&a=actualize
> I was initially suggesting to have no cron, but I forgot about background updates being needed to avoid missing articles while the browser is closed.
I believe cPanel or some such offers a "web cron" (load a link periodically) without offering a "real" cron (run a local script periodically).
Indeed, "web cron" would be the alternative to a locally installed cron. But some form of cron is a necessity unless losing articles is of no concern.
It wouldn't be any worse than a traditional desktop app anyway, probably still slightly better (due to being able to access it on your phone). But yes, cron was once the primary reason for me to not just use e.g. QuiteRSS.
Ok, here is my first prototype. It's still very raw, writes files that grow indefinitely, and has no configuration interface (yet). It's not a release :) xExtension-AutoCron_v0.4.zip
The basic idea is the following:
class AutoCronExtension extends Minz_Extension {
    private $runCronjob = false;

    // Called for every request made to FreshRSS.
    public function init() {
        if (...enough time elapsed since last run...) {
            $this->runCronjob = true;
            // alternatively, see text below:
            $this->registerHook('freshrss_init', [$this, 'liveUpdate']);
        }
    }

    // Runs at the end of the request, so the response itself is not delayed.
    public function __destruct() {
        if ($this->runCronjob) {
            $callpath = FRESHRSS_PATH . "/app/actualize_script.php";
            // Non-blocking exec: redirecting stdout and stderr and backgrounding with & tells PHP not to wait for the task to complete.
            shell_exec("php -f " . escapeshellarg($callpath) . " > /dev/null 2>/dev/null &");
        }
    }
}
So currently, I am simply firing the normal cronjob script in a separate process at the end of the script run. This has the advantage that the request is not delayed, so FRSS should run smoothly and the cronjob script runs in its own, clean environment. Also, I guess all concurrency problems were already solved when writing the cronjob script.
The downside is that FRSS would not be up to date on the first request. In scenario 2, when I poll FRSS output with a single request, let's say once per hour, I would need to wait another hour until I see the updated content.
Now, my question is how to do a 'blocking' update, before processing the original request that we just intercepted?
- My first guess would be to instantiate a FreshRSS_feed_Controller and call ->actualizeAction() (/app/Controllers/feedController.php), but I guess that this may interfere with the original request, because it evaluates URL parameters, which I can neither control nor want to overwrite.
- My second guess would be to write a short adaptation of the above method, calling (maybe?):
  - FreshRSS_category_Controller::refreshDynamicOpmls();
  - FreshRSS_feed_Controller::actualizeFeeds($id, $url, $maxFeeds);
  - FreshRSS_feed_Controller::commitNewEntries();

From an MVC point of view, is this the right way to go? Do you see any side effects, especially regarding interference with the original request? I guess the right place to call this new routine would be the 'freshrss_init' hook?
What I already realized is that refreshing will only be done for the current user, whereas the cronjob script will refresh everything for all users. So I would need to run the cronjob script afterwards anyway, but in an asynchronous, non-blocking way.
This is a job for a JavaScript timer on an endless loop, in my opinion.
Ok, but the JavaScript you mention is already there. As you said, as long as you have the web interface open, a request is fired every two minutes. This causes the extension to be executed every time, triggering the cron job at the given interval. I can see the requests in my log file.
The advantage of my approach is that it also covers the other ways of access (API, RSS, etc.).
If the extension gets called via https://example.com/freshrss/i/?c=javascript&a=nbUnreadsPerFeed then yes, that would work. When you talked about a scenario of polling once per hour I thought you were doing something else.
Yes, it does get called. Well, I am thinking about "polling once per hour" because I don't have the web interface open.
I am thinking more of dedicated hardware that consumes the feeds (media players, etc.). The easiest example would be a FRITZ!Box, which can consume RSS feeds as well as podcasts (also RSS). These are likely 'always on' devices, so refresh intervals will be rather reliable.
Also I do not really care about missing single episodes / articles.
Ah, and regarding web cron: yes, this is yet another alternative. I would prefer not to rely on another web service. If it's paid, you would be better off paying for real web space that offers cron natively. If it's not paid, it will likely be rate limited. Or you need to respond to an e-mail every six months stating that you are still interested, or whatever.