squid Bug 4947: EventScheduler loops on same task forever

The event scheduler itself does not update the time variables, which in some conditions may cause it to loop over the same task forever, and the task itself may never complete and simply abort and reschedule itself if it also does not update the time variables and is waiting for a certain timeout.

The fix adds a call to "getCurrentTime()" when the event scheduler checks events in the queue to avoid this condition.

May 08 '19 08:05 craiggowing

Can one of the admins verify this patch?

May 08 '19 08:05 squid-prbot

Ultimately we worked around the problem be changing the config and just avoiding setting dns_timeout to 0, but my concern was if there might be other cases where this kind of looping would occur and if it would be possible to prevent it in the EventScheduler code.

If it is the responsibility of the zero-delay jobs to handle this correctly then it seems reasonable that it should be fixed on a case-by-case basis, and the fix should just apply to the idnsCheckQueue code. I don't see any reason why this particular task would ever need to be zero-delay, so my first thought was to impose a minimum delay and enforce it either when validating the configuration (do not accept 0 seconds), and/or in idnsTickleQueue when the task is scheduled.

May 09 '19 07:05 craiggowing

my concern was if there might be other cases where this kind of looping would occur

Your concern is valid, and your work on addressing that concern (to the extent possible/reasonable) is appreciated. I hope that nothing in my previous response implied otherwise.

If it is the responsibility of the zero-delay jobs to handle this correctly

I believe those callers/jobs is the only location where this problem can be prevented. Event queue modifications, like the workaround you have proposed, cannot prevent some infinite event loops. Ideally, it would be nice to detect (and complain about/break) such infinite event loops in the event queue code, but doing so is probably very difficult, and the actual fix of the detected/salvaged problem would still belong to the eventAdd() caller!

then it seems reasonable that it should be fixed on a case-by-case basis, and the fix should just apply to the idnsCheckQueue code. I don't see any reason why this particular task would ever need to be zero-delay, so my first thought was to impose a minimum delay and enforce it either when validating the configuration (do not accept 0 seconds), and/or in idnsTickleQueue when the task is scheduled.

I agree in principle.

In many cases, it is possible to assign some reasonable meaning to zero delays (which would be treated specially in the code, to support that documented meaning). In this particular case, we could consider something like minimal non-blocking delay roughly equivalent to one main loop iteration[1]. Directive documentation aside, automatically increasing configured zero delays to a minimum non-zero delay[2] is probably the right implementation approach. I would not even warn about this increase -- a directive documentation update would be sufficient.

[1] The zero-delay meaning currently implemented in the broken official code is, essentially, not "one main loop iteration delay" (i.e., "process during the next main loop iteration") but "process during the same loop iteration", which causes the infinite event loop you have discovered.

[2] Consider using std::nextafter(0.0, 1.0) for that minimum increase/delay.

May 09 '19 14:05 rousskov

[2] Consider using std::nextafter(0.0, 1.0) for that minimum increase/delay.

Actually, using std::nexttoward(0, 1) may be slightly better because it avoids spelling out unnecessary details (i.e., the double type of arguments) and uses simpler integral types. See docs.

May 09 '19 15:05 rousskov

Anyone still working on this?

Nov 04 '19 11:11 yadij

squid squid copied to clipboard

Bug 4947: EventScheduler loops on same task forever

squid
squid copied to clipboard