SleekXMPP icon indicating copy to clipboard operation
SleekXMPP copied to clipboard

handler threads don't always exit

Open coffeeowl opened this issue 12 years ago • 7 comments

If connection with XMPP server was lost (even if it was clean server shutdown) event handling thread doesn't exit if reconnect attempt was made right after disconnect event. The problem is in while loop https://github.com/fritzy/SleekXMPP/blob/master/sleekxmpp/xmlstream/xmlstream.py#L1311

    try:
        while not self.stop.is_set():
            try:
                wait = self.wait_timeout
                event = self.event_queue.get(True, timeout=wait)
            except queue.Empty:
                event = None
            if event is None:
                continue

due to timeout, self.stop can be cleared before event_queue times out and when while makes new iteration self.stop is already cleared again.

DEBUG:sleekxmpp.xmlstream.xmlstream:!!! waiting for events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! no events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! waiting for events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! no events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! waiting for events ERROR:sleekxmpp.xmlstream.xmlstream:Error reading from XML stream. ERROR:sleekxmpp.basexmpp:no element found: line 1, column 734 Traceback (most recent call last): File "/usr/local/lib/python2.7/dist-packages/sleekxmpp/xmlstream/xmlstream.py", line 1159, in _process if not self.__read_xml(): File "/usr/local/lib/python2.7/dist-packages/sleekxmpp/xmlstream/xmlstream.py", line 1195, in __read_xml for event, xml in ET.iterparse(self.filesocket, (b'end', b'start')): File "", line 107, in next ParseError: no element found: line 1, column 734 DEBUG:sleekxmpp.xmlstream.xmlstream:SEND (IMMED): /stream:stream DEBUG:sleekxmpp.xmlstream.xmlstream:Waiting for /stream:stream from server DEBUG:sleekxmpp.xmlstream.xmlstream:!!! no events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! waiting for events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! no events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! waiting for events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! no events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! waiting for events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! auto_reconnect: False DEBUG:sleekxmpp.xmlstream.xmlstream:!!! stop set DEBUG:sleekxmpp.xmlstream.xmlstream:!!! stop - True DEBUG:sleekxmpp.thirdparty.statemachine: ==== TRANSITION connected -> disconnected DEBUG:sleekxmpp.xmlstream.xmlstream:!!!!!!! stop cleared !!!!!!!! DEBUG:sleekxmpp.xmlstream.xmlstream:Trying to connect to 127.0.0.1:5222 DEBUG:sleekxmpp.xmlstream.xmlstream:Waiting 2.01526779831 seconds before connecting. DEBUG:sleekxmpp.xmlstream.xmlstream:!!! no events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! waiting for events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! no events

Also this looks strange https://github.com/fritzy/SleekXMPP/blob/master/sleekxmpp/xmlstream/xmlstream.py#L1373

        while not self.stop.is_set():
            while not self.stop.is_set and \
                  not self.session_started_event.is_set():

is_set as a property?

coffeeowl avatar Mar 21 '12 10:03 coffeeowl

As for the strangeness with is_set, that has already been fixed in the develop branch.

legastero avatar Mar 21 '12 15:03 legastero

Would you mind sharing what your main app loop looks like with how you're handling connections/reconnects, etc.

It seems from your recent issues that it is just different enough from the way we write Sleek-based apps here that our testing process isn't catching these kinds of bugs.

legastero avatar Mar 21 '12 17:03 legastero

It is nothing special, simple loop with connect -> process:

while node.run:
    try:
        if node.connect(reattempt=False):
            node.process(threaded=False)
        else:
            raise Exception("connect() failed")
    except Exception, e:
        time.sleep(15)

remove_pid()
sys.exit(0)

Everything else before "while" is code for daemonization, privilege dropping, settings and signal handling and so on.

Maybe I should decrease upper bound for timeout in SleekXMPP and let it handle reconnects... or at least wait a second before calling connect() after process() exits. But anyway, I guess my code is legal, the functions are public and I call them in right order.

coffeeowl avatar Mar 22 '12 19:03 coffeeowl

That is certainly legal code that ought to work. As a temporary measure, adding in a two second delay between connection attempts should be plenty to ensure the threads detect the stop signal.

I'm currently investigating keeping a thread count and not letting disconnect() return until all threads have exited, which effectively would add that two second delay to the disconnect() call.

legastero avatar Mar 22 '12 22:03 legastero

Is there any progress on this issue?

duja avatar Apr 10 '12 12:04 duja

Develop branch now tracks the main threads and waits for them to exit before returning from disconnect().

legastero avatar Apr 23 '12 01:04 legastero

We have the problem that XMPPClient.disconnect() does not return to True as expected and other threads are still running in the background. Is there a possibility to forcefully stop these blocking threads?

ghost avatar Jun 12 '17 16:06 ghost