
Limit/throttle updates

Open mgorny opened this issue 4 years ago • 12 comments

I'm following quite a lot of GitHub release feeds with Liferea. When it's updating, near the end of the list GitHub starts rejecting requests and the last feeds error out.

What I'd really use here is an option to limit the number of updates over time. I think it'd be sufficient to have Liferea stop after doing M requests, and wait T time before continuing.

mgorny avatar Apr 25 '20 06:04 mgorny

In the past we had an update concurrency setting (back then we had independent threads). Nowadays it is just the GLib event loop with a fixed number of concurrently processed updates. Currently the hard-coded limit is 5.

Maybe it is worth providing this as a dconf setting again so people can override it.

Of course this does not help in your case, because you are fetching from the same server. So I think the solution lies more in the direction of daily updates and never triggering a manual global update.

Other than that, we'd have to implement a per-server back-off mechanism, which I'd like to avoid.
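For illustration, the fixed concurrency cap mentioned above (hard-coded to 5) could be sketched as an async worker pool where the limit is a plain parameter; this is a hypothetical sketch, not Liferea's actual GLib-based code, and `fetch_feed` is a stand-in for the real HTTP fetch:

```python
import asyncio

async def fetch_feed(url: str) -> str:
    """Placeholder for the real HTTP fetch."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"fetched {url}"

async def update_all(urls, max_concurrent=5):
    # A semaphore caps how many updates run at once; exposing
    # max_concurrent as a setting would let users override the limit.
    sem = asyncio.Semaphore(max_concurrent)

    async def guarded(url):
        async with sem:
            return await fetch_feed(url)

    return await asyncio.gather(*(guarded(u) for u in urls))

results = asyncio.run(
    update_all([f"https://example.org/feed/{i}" for i in range(20)])
)
```

Lowering `max_concurrent` to 1 reproduces the workaround mgorny describes below, at the cost of a much longer total update time.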

lwindolf avatar May 12 '20 18:05 lwindolf

I've just hard-coded it to 1, and it indeed slowed updates down enough that they no longer fail. So yes, I suppose that would be a good enough option for the time being.

mgorny avatar May 14 '20 05:05 mgorny

I was thinking of something similar for when updates run as a background task. If the user asked for an update, it would run right away. For the problem use case, it would be desirable to rate-limit a particular host because of access limits, no matter whether it was a batch or a user request.

rich-coe avatar May 14 '20 12:05 rich-coe

Well, the limit of 1 stopped working for me once I reached >~480 feeds.

mgorny avatar Jun 20 '20 06:06 mgorny

@mgorny I think for your use case we need a per-server request counter actively delaying updates to a certain requests-per-second rate.

lwindolf avatar Jun 22 '20 09:06 lwindolf

Probably. However, I think for the time being #845 would be helpful enough (being able to select multiple items and refresh those that failed).

mgorny avatar Jun 22 '20 09:06 mgorny

The "per site" approach wouldn't really solve the problem, because in my case, with 270 feeds, if I trigger the "Update all feeds" action from Liferea on my laptop while I'm at my parents' house, it crashes the local wifi router (or maybe it's the wifi network card driver that crashes) and I have to turn the wifi off and on again :) If it were doing only a couple of feed updates per second at most, I don't think it could crash this equipment.

In my humble opinion, having the app default to throttling itself to one feed per 100–250 ms would probably be sufficient to solve most (if not all) issues. At one feed per 250 ms that would mean 1 minute for me, or 2 minutes for @mgorny, which I think is reasonable, but perhaps 100 ms would make his life easier (~48 s to update).

On my laptop, the reason why I trigger "Update all feeds" manually instead of "on startup" is precisely because I want to be able to choose the right moment to do it, when I am on a somewhat reliable network and not behind some captive portal or hostile AP. So the irony of me crashing the network because I triggered a feed update is not lost on me!
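The fixed-pacing idea above amounts to sleeping between requests; a minimal sketch (the `fetch` callback and `delay_ms` parameter are illustrative, not Liferea API):

```python
import time

def update_all_paced(urls, fetch, delay_ms=250):
    """Fetch feeds one at a time, waiting delay_ms between requests
    so the batch never bursts faster than ~4 requests/second."""
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            time.sleep(delay_ms / 1000.0)
        results.append(fetch(url))
    return results

# 270 feeds at 250 ms spacing take roughly 67 s; at 100 ms, roughly 27 s.
results = update_all_paced(["a", "b", "c"], fetch=lambda u: u.upper(), delay_ms=1)
```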

nekohayo avatar Oct 30 '20 22:10 nekohayo

I'm not sure whether I didn't mention it originally or this is something that changed recently, but GitHub now issues HTTP error 429 (Too Many Requests), so it should be possible to easily distinguish this problem from other update failures.

In particular, it would be sufficient for me if Liferea handled this error specially and delayed further updates from the server for a few seconds.
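Special-casing 429 could be as simple as recording a per-host cooldown when the error is seen; a hypothetical sketch (note that servers may also send a `Retry-After` header stating exactly how long to wait):

```python
import time

server_cooldown = {}  # host -> monotonic timestamp before which we must not retry

def handle_response(host, status, retry_after=None, cooldown_s=10):
    """On 429, record a cooldown for the host; callers check
    may_request() before scheduling the next request to it."""
    if status == 429:
        wait = float(retry_after) if retry_after else cooldown_s
        server_cooldown[host] = time.monotonic() + wait
        return False  # caller should requeue this feed for later
    return True

def may_request(host):
    return time.monotonic() >= server_cooldown.get(host, 0.0)
```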

mgorny avatar Nov 05 '20 11:11 mgorny

@mgorny Good idea. I've actually seen this on other sites too, which I requested a bit too enthusiastically during feed fetch tests.

So keeping a hashtable of (domain, timestamp) tuples along with some cooldown interval could be a simple solution. Although exponential backoff would be nicer...

And having an HTTP request budget per minute, as described by @nekohayo, also sounds good. There could be a preference for the number of network requests per minute. The budget would apply to automatic/startup/background updates only; user-requested updates would always run.
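The hashtable-plus-exponential-backoff idea could look roughly like this; all names are illustrative, not Liferea internals:

```python
import time

class DomainBackoff:
    """Per-domain exponential backoff: each consecutive failure
    doubles the cooldown; a success resets it."""

    def __init__(self, base_s=2.0, max_s=300.0):
        self.base_s = base_s   # cooldown after the first failure
        self.max_s = max_s     # cap so the delay never grows unbounded
        self.state = {}        # domain -> (failure_count, next_allowed_ts)

    def on_failure(self, domain):
        failures, _ = self.state.get(domain, (0, 0.0))
        failures += 1
        delay = min(self.base_s * (2 ** (failures - 1)), self.max_s)
        self.state[domain] = (failures, time.monotonic() + delay)
        return delay

    def on_success(self, domain):
        self.state.pop(domain, None)

    def ready(self, domain):
        _, next_ts = self.state.get(domain, (0, 0.0))
        return time.monotonic() >= next_ts
```

A global requests-per-minute budget could then sit on top as a separate token-bucket check that background updates consult before `ready()`.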

lwindolf avatar Dec 16 '20 23:12 lwindolf

[...] And the budget would be applied for automatic/startup/background updates only. User requested updates would always run.

Not sure why you would want user-triggered global updates not to respect the throttle setting; I'd say both the startup auto-update and a user-triggered global update should conform to it. My laptop's instance of Liferea is configured not to update on startup (because with a laptop, the app can't know whether I'm on a trusted and reliable network), but when I ask it to update, if it isn't throttled, it can crash the router/wifi, even though that's not "on startup" :)

Even when I manually trigger an update, I don't necessarily need it to be "as fast as technologically possible"; "fast enough" is fine as long as I have progress indication (issue #810)...

nekohayo avatar Jan 03 '21 03:01 nekohayo

@nekohayo This feature was one of the most requested early on (1.4.x), as the perceived lag of the update was too long. With a larger number of feeds (especially with extra features such as content extraction enabled), right after startup you'd need to wait dozens of seconds before your manually triggered update happens. By definition (the 5-second rule), that is bad UX.

lwindolf avatar Jan 03 '21 12:01 lwindolf

For the record, I've managed to switch the vast majority of my feeds to PyPI (now that it publishes release feeds), and it's much better than GitHub. I've got almost 1500 feeds now; they update really fast and there are no silly limits like with GitHub. So in the end, the problem may be GitHub's bad RSS/Atom implementation.

mgorny avatar Jan 03 '21 12:01 mgorny