
[5] Timeout for bots

Open sebix opened this issue 9 years ago • 7 comments

In another program, but in a similar context, we encountered a problem with long-running HTTP requests. More specifically, the data rate of a request sometimes drops to a very low level. As a consequence, the job does not finish and also blocks other runs (the program was started regularly by cron).

This shows that we need something like a timeout for all kinds of lookups/requests. This could be solved by spawning a thread which is killed after some time. But Python does not support killing threads :( We can still kill the whole process, and the init system (e.g. systemd) automatically spawns a new process. Maybe there is a better solution?
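A minimal sketch of the "kill the whole process" idea, assuming the slow lookup can be run as a separate child process. The helper name `run_with_hard_timeout` is hypothetical and not part of IntelMQ; it relies only on the standard library's `subprocess.run`, which kills the child when its `timeout` expires.

```python
import subprocess
import sys


def run_with_hard_timeout(argv, timeout_seconds):
    """Run argv as a child process and kill it if it runs too long.

    Returns the child's exit code, or None if it had to be killed.
    (Hypothetical helper for illustration only.)
    """
    try:
        completed = subprocess.run(argv, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run() has already killed and reaped the child here.
        return None
    return completed.returncode


# Stand-in for a hanging lookup: a child that would sleep for 30 seconds.
hanging_job = [sys.executable, "-c", "import time; time.sleep(30)"]
print(run_with_hard_timeout(hanging_job, timeout_seconds=1))
```

Unlike a thread, a child process can be terminated from the outside, so the parent stays alive and can decide whether to retry or to report the failure.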

Setting a timeout on the request itself does not help in the case described above, because data was still flowing. However, we can set timeouts for establishing the connection and for the read. But these are very simple fixes.

sebix avatar Sep 06 '16 12:09 sebix

How do you handle a time out then?

aaronkaplan avatar Oct 05 '16 17:10 aaronkaplan

If this concerns the HTTP timeouts of the HTTP collectors, those can be set quite simply:

https://requests.readthedocs.io/en/master/user/quickstart/#timeouts
https://requests.readthedocs.io/en/master/user/advanced/#timeouts

By default, requests do not time out unless a timeout value is set explicitly.

and

The timeout value will be applied to both the connect and the read timeouts. Specify a tuple if you would like to set the values separately:

dmth avatar Oct 06 '16 08:10 dmth

On 06 Sep 2016, at 14:11, Sebastian [email protected] wrote:

In another program but in a similar context we encountered a problem with long-running HTTP-requests. More specifically, in a request the data rate drops significantly sometimes to a very low rate. As a consequence the job is not finishing and blocks also other runs (the program was started regularly with cron).

This shows that we need something like a timeout for all kind of lookups/requests. This can be solved by spawning a thread which is to be killed after some time. But python does not support killing threads :( We can still kill the whole process and the init-system (e.g. systemd) automatically spawns a new process. Maybe there is a better solution?

Does not sound very nice. I prefer @dmth's approach.

And if data is flowing (but very slowly), it is not something we can really handle right now. What we can do is record the events/sec rate in each bot and visualise that. This will at least allow the operator to determine the source of the performance problem.
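As a rough illustration of recording an events/sec rate, here is a sketch of a sliding-window counter. The class name `RateMeter` and its parameters are assumptions made for this example; nothing like it is claimed to exist in IntelMQ.

```python
import time
from collections import deque


class RateMeter:
    """Sliding-window events-per-second counter (hypothetical helper)."""

    def __init__(self, window_seconds=60.0, clock=time.monotonic):
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the meter can be tested
        self._timestamps = deque()

    def _expire(self, now):
        # Drop timestamps that have fallen out of the window.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()

    def record(self):
        """Call once per processed event."""
        now = self._clock()
        self._timestamps.append(now)
        self._expire(now)

    def rate(self):
        """Average events per second over the last window."""
        self._expire(self._clock())
        return len(self._timestamps) / self.window_seconds
```

A bot could call `record()` for every processed event and periodically log `rate()`. Storing one timestamp per event is simple but memory-hungry at high rates; a counter per time bucket would be a cheaper variant.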

My 2 cents, a.

aaronkaplan avatar Oct 06 '16 09:10 aaronkaplan

If this concerns the HTTP-Timeouts of the HTTP-Collectors, those can be set quite simple:

This is true, but it is not the case which I described above. If the rate is very low, the timeout does not come into effect. (This scenario is not hypothetical.)
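One possible way to catch this trickle case is an overall time budget on top of the per-read timeout. The sketch below is an assumption about how that could look, not an existing IntelMQ feature: `read_with_deadline` and `TransferTooSlow` are hypothetical names, and `chunks` can be any iterable of byte chunks (with requests, for instance, the result of `response.iter_content(...)` on a `stream=True` request).

```python
import time


class TransferTooSlow(Exception):
    """Raised when a transfer exceeds its overall time budget."""


def read_with_deadline(chunks, max_seconds, clock=time.monotonic):
    """Collect byte chunks, aborting once max_seconds have elapsed in total.

    The deadline is only checked when a chunk arrives, so the underlying
    read still needs its own timeout to bound the wait for each chunk.
    """
    deadline = clock() + max_seconds
    received = bytearray()
    for chunk in chunks:
        received.extend(chunk)
        if clock() > deadline:
            raise TransferTooSlow(
                f"gave up after {max_seconds}s with {len(received)} bytes"
            )
    return bytes(received)
```

With a read timeout of a few seconds per chunk and a total budget of, say, a few minutes, a connection that keeps delivering a few bytes at a time is still aborted eventually.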

sebix avatar Oct 17 '16 13:10 sebix

probably useful: https://wiki.python.org/moin/PythonDecoratorLibrary#Function_Timeout
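For reference, a sketch of one common way to build such a function timeout decorator, based on `SIGALRM`. It is written for this thread rather than copied from the wiki page, the names `timeout` and `FunctionTimeout` are illustrative, and it only works on Unix and in the main thread.

```python
import functools
import signal
import time


class FunctionTimeout(Exception):
    """Raised when a decorated function runs longer than allowed."""


def timeout(seconds):
    """Abort the decorated function after `seconds` (Unix, main thread only)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            def on_alarm(signum, frame):
                raise FunctionTimeout(f"{func.__name__} exceeded {seconds}s")

            previous = signal.signal(signal.SIGALRM, on_alarm)
            signal.setitimer(signal.ITIMER_REAL, seconds)
            try:
                return func(*args, **kwargs)
            finally:
                # Cancel the timer and restore the previous handler.
                signal.setitimer(signal.ITIMER_REAL, 0)
                signal.signal(signal.SIGALRM, previous)
        return wrapper
    return decorator


@timeout(1)
def slow_lookup():
    time.sleep(30)  # stand-in for a request that never finishes
```

Calling `slow_lookup()` raises `FunctionTimeout` after about one second. Because the exception is raised from a signal handler, it interrupts blocking calls such as `time.sleep` or a socket read, but it cannot be used from worker threads.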

sebix avatar Nov 10 '16 13:11 sebix

Working on it right now, 1st step: HTTP-Timeouts

dmth avatar Jan 23 '17 11:01 dmth

I learned today that this issue is already being worked on in: https://github.com/certtools/intelmq/pull/835

Nevertheless my results can be found here: https://github.com/Intevation/intelmq/tree/dev-collector-timeout

dmth avatar Jan 23 '17 14:01 dmth