core icon indicating copy to clipboard operation
core copied to clipboard

OctoPrint every 2 minutes unavailable

Open pestregen opened this issue 3 years ago • 12 comments

The problem

I'm using Octoprint Version 1.8.6 on an Android-TV-Box with Octo4a. On HA it's running with the auto-discovered integration.

The thing is, almost every 1-2 minutes changeing all entitys to unavailable and about 1-2 minutes later back to available. This problem goes since the beginning, after i installed the integration, 4 weeks ago (not only on core-2022.12.0, also on 2022.11.4 and .5). I tried to restart HA, restart the integration, restart Octo, nothing changed. Also i dont know its a problem of HA or Octo...

Anyone who knows whats going on, or have the same problem?

What version of Home Assistant Core has the issue?

core-2022.12.0

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant OS

Integration causing the issue

OctoPrint

Link to integration documentation on our website

https://www.home-assistant.io/integrations/octoprint/

Diagnostics information

No response

Example YAML snippet

No response

Anything in the logs that might be useful for us?

Logger: homeassistant.components.octoprint
Source: helpers/update_coordinator.py:168
Integration: OctoPrint (documentation, issues)
First occurred: 8. Dezember 2022 um 15:50:19 (335 occurrences)
Last logged: 08:20:47

Error requesting octoprint-*** data: Server disconnected
Error requesting octoprint-*** data: [Errno 104] Connection reset by peer

Additional information

No response

pestregen avatar Dec 09 '22 09:12 pestregen

Hey there @rfleming71, mind taking a look at this issue as it has been labeled with an integration (octoprint) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of octoprint can trigger bot actions by commenting:

  • @home-assistant close Closes the issue.
  • @home-assistant rename Awesome new title Change the title of the issue.
  • @home-assistant reopen Reopen the issue.
  • @home-assistant unassign octoprint Removes the current integration label and assignees on the issue, add the integration domain after the command.

(message by CodeOwnersMention)


octoprint documentation octoprint source (message by IssueLinks)

home-assistant[bot] avatar Dec 09 '22 09:12 home-assistant[bot]

I am also seeing this on 2022.11.x sensors constantly flip between actual state and unavailable. Maybe the timeout needs to be increased? I especially notice this when I am actively printing something.

dshokouhi avatar Dec 09 '22 17:12 dshokouhi

I especially notice this when I am actively printing something.

Yep, flipping about every 2 minutes during printing. If octo is idling and not printing, there is no problem.

pestregen avatar Dec 20 '22 09:12 pestregen

I'm having the same issue as well (HA 2023.1.4). The disconnecting seemed to go away when I toggled "Enable polling for updates" OFF. However, as expected, nothing updated while printing (% complete, status, temps, etc).

jodysaliba avatar Jan 14 '23 19:01 jodysaliba

Same issue here, this also actually breaks automations, since the chance of the sensor values being "unavailable" when evaluating conditions and other templates for an automation is pretty high. Of course it would be possible to work around this using some template trickery, but fixing the issue is a preferred solution.

I found this issue, which might be related: https://github.com/fredrikbaberg/hassio-addons/issues/274

markusressel avatar Feb 26 '23 05:02 markusressel

I have the same issue for a while now. As a workaround I was already using the Home Assistant MQTT integration, and after installing and configuring these two plugins for OctoPrint, I have a stable reporting (though the items and names are a bit different):

adarazs avatar Feb 26 '23 12:02 adarazs

I tried to look into this, but getting the development environment to work seems to be quite a challenge, at least on Linux. I can't get the developing container to work at all, and I would rather want to work with it though pycharm. Does someone have a guide to set this up in pycharm maybe? The official docs only seem to mention VSCode.

markusressel avatar Mar 02 '23 00:03 markusressel

I have looked into the octoprint integration code a little, and from what I can see it sets a fixed update interval of 30 seconds. So the fact that it is becoming unavailable on a longer interval is strange.

I have checked out the pyoctoprintapi project and did a local test. This is the library that the octoprint home assistant integration is using for communicating with octoprint. Fetching the octoprint version every second (while the printer is not active) works without issues.

I also checked out the OctoRest project and did the same test, and it also works without issues.

I did not yet check what happens while the printer is active. When I have a print running I will try to redo these checks to find out if the issue might be on the rest client or octoprint side of things.

markusressel avatar Mar 05 '23 01:03 markusressel

Ok so I redid the test by using the VirtualPrinter plugin, and it seems like either OctoRest/pyoctoprintapi or the Octoprint API have problems responding to requests during printing:

Running OctoRest in a simple loop:

def main():
    c = make_client()
    while True:
        try:
            now = datetime.now()
            print(f"{now} - Version: {c.get_version()}")
            print(f"{now} - Printing: {c.printer()['state']['flags']['printing']}")
        except Exception as ex:
            print(f"Error: {ex}")
        finally:
            time.sleep(1)

gives the following output (I was switching between printing and paused state multiple times):

Error: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
2023-03-15 05:40:21.061831 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
Error: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
2023-03-15 05:40:22.123158 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
Error: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
2023-03-15 05:40:23.173642 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
Error: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
2023-03-15 05:40:24.282325 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
Error: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2023-03-15 05:40:25.380805 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
Error: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
2023-03-15 05:40:26.435626 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
Error: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2023-03-15 05:40:27.498021 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
Error: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
2023-03-15 05:40:28.551656 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
Error: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
2023-03-15 05:40:29.606055 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
Error: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
2023-03-15 05:40:30.668801 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
Error: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
2023-03-15 05:40:31.739888 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
2023-03-15 05:40:31.739888 - Printing: True
2023-03-15 05:40:32.862748 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
2023-03-15 05:40:32.862748 - Printing: False
2023-03-15 05:40:33.927432 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
2023-03-15 05:40:33.927432 - Printing: False
2023-03-15 05:40:34.962198 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
2023-03-15 05:40:34.962198 - Printing: False
2023-03-15 05:40:36.037761 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
2023-03-15 05:40:36.037761 - Printing: False
2023-03-15 05:40:37.102435 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
Error: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2023-03-15 05:40:38.132847 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
2023-03-15 05:40:38.132847 - Printing: False
2023-03-15 05:40:39.197125 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
Error: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
2023-03-15 05:40:40.241001 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
Error: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
2023-03-15 05:40:41.368768 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
Error: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
2023-03-15 05:40:42.420958 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
Error: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
2023-03-15 05:40:43.544009 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
2023-03-15 05:40:43.544009 - Printing: True
2023-03-15 05:40:44.666430 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
Error: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2023-03-15 05:40:45.684632 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
Error: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
2023-03-15 05:40:46.735010 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
Error: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
2023-03-15 05:40:47.783154 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
2023-03-15 05:40:47.783154 - Printing: True
2023-03-15 05:40:48.890040 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
2023-03-15 05:40:48.890040 - Printing: False
2023-03-15 05:40:49.956715 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
2023-03-15 05:40:49.956715 - Printing: False
2023-03-15 05:40:51.021132 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
2023-03-15 05:40:51.021132 - Printing: False
2023-03-15 05:40:52.088551 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
2023-03-15 05:40:52.088551 - Printing: False
2023-03-15 05:40:53.152045 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
2023-03-15 05:40:53.152045 - Printing: False
2023-03-15 05:40:54.216271 - Version: {'api': '0.1', 'server': '1.8.7', 'text': 'OctoPrint 1.8.7'}
2023-03-15 05:40:54.216271 - Printing: False
...

which seems to indicate that even while "paused" the API sometimes cannot respond correctly, and even more so during printing.

Running the same test with pyoctoprintapi gives similar results:

async def main(host, user, port, use_ssl):
    async with aiohttp.ClientSession(cookie_jar=aiohttp.CookieJar(unsafe=True)) as websession:
        websession._default_headers = MappingProxyType({})  # type: ignore
        client = pyoctoprintapi.OctoprintClient(host, websession, port, use_ssl, "/")
        api_key = "XXX"
        client.set_api_key(api_key)

        while True:
            try:
                printer_info = await client.get_printer_info()
                job_info = await client.get_job_info()

                now = datetime.now()
                print(f"{now}: {printer_info.state.text} {job_info.progress.print_time}")
            except Exception as ex:
                print(f"Error: {ex}")
            finally:
                await asyncio.sleep(1)

        await websession.close()
Error: [Errno 104] Connection reset by peer
Error: [Errno 104] Connection reset by peer
Error: Server disconnected
Error: Server disconnected
Error: Server disconnected
Error: [Errno 104] Connection reset by peer
Error: Server disconnected
Error: [Errno 104] Connection reset by peer
Error: Server disconnected
Error: [Errno 104] Connection reset by peer
Error: Server disconnected
Error: [Errno 104] Connection reset by peer
2023-03-15 05:46:08.360371: Printing 771
2023-03-15 05:46:09.427906: Paused 771
2023-03-15 05:46:10.502426: Paused 771
2023-03-15 05:46:11.581161: Paused 771
2023-03-15 05:46:12.644165: Paused 771
2023-03-15 05:46:13.688133: Paused 771
2023-03-15 05:46:14.757835: Paused 771
2023-03-15 05:46:15.825112: Paused 771
2023-03-15 05:46:16.897657: Paused 771
Error: Server disconnected
Error: [Errno 104] Connection reset by peer
Error: [Errno 104] Connection reset by peer
Error: [Errno 104] Connection reset by peer
2023-03-15 05:46:22.313908: Printing 776
Error: [Errno 104] Connection reset by peer
Error: [Errno 104] Connection reset by peer
Error: [Errno 104] Connection reset by peer
2023-03-15 05:46:26.627862: Paused 780
2023-03-15 05:46:27.686636: Paused 780
2023-03-15 05:46:28.754240: Paused 780
2023-03-15 05:46:29.824086: Paused 780
2023-03-15 05:46:30.891531: Paused 780
2023-03-15 05:46:31.959175: Paused 780
2023-03-15 05:46:33.028373: Paused 780
2023-03-15 05:46:34.089405: Paused 780

One notable thing is that this seems to be caused by doing multiple requests in immediate succession too quickly, especially if the endpoints are different. F.ex. the following seems to work fine (at least most of the time):

...
printer_info = await client.get_printer_info()
await asyncio.sleep(0.1)
printer_info = await client.get_printer_info()
...
2023-03-15 05:52:48.059840: Printing
2023-03-15 05:52:49.210709: Printing
2023-03-15 05:52:50.411332: Printing
2023-03-15 05:52:51.616427: Printing
2023-03-15 05:52:52.806280: Printing
2023-03-15 05:52:53.990342: Printing
2023-03-15 05:52:55.184654: Printing
2023-03-15 05:52:56.396183: Printing
2023-03-15 05:52:57.631863: Printing
2023-03-15 05:52:58.919156: Printing
2023-03-15 05:53:00.150320: Printing
2023-03-15 05:53:01.432903: Printing
2023-03-15 05:53:02.833970: Printing
2023-03-15 05:53:04.084294: Printing
2023-03-15 05:53:05.296543: Printing
2023-03-15 05:53:06.504620: Printing
2023-03-15 05:53:07.746038: Printing
2023-03-15 05:53:09.068056: Printing
2023-03-15 05:53:10.288883: Printing
2023-03-15 05:53:11.611417: Printing
2023-03-15 05:53:12.819314: Printing
2023-03-15 05:53:14.036435: Printing
2023-03-15 05:53:15.244207: Printing

while this spits out errors almost 100% of the time, while printing:

...
printer_info = await client.get_printer_info()
printer_info = await client.get_printer_info()
...
2023-03-15 05:50:43.199914: Printing
2023-03-15 05:50:44.394441: Printing
Error: Server disconnected
Error: [Errno 104] Connection reset by peer
2023-03-15 05:50:47.882372: Printing
2023-03-15 05:50:49.209780: Printing
Error: Server disconnected
2023-03-15 05:50:51.470263: Pausing
2023-03-15 05:50:52.673247: Paused
2023-03-15 05:50:53.867497: Paused
2023-03-15 05:50:55.082839: Paused
2023-03-15 05:50:56.287068: Paused
2023-03-15 05:50:57.497185: Paused
Error: [Errno 104] Connection reset by peer
Error: [Errno 104] Connection reset by peer
Error: [Errno 104] Connection reset by peer
Error: [Errno 104] Connection reset by peer
Error: [Errno 104] Connection reset by peer
2023-03-15 05:51:04.151138: Printing
Error: [Errno 104] Connection reset by peer
Error: Server disconnected
2023-03-15 05:51:07.559168: Printing
2023-03-15 05:51:08.838187: Printing
2023-03-15 05:51:10.114703: Printing
2023-03-15 05:51:11.348302: Printing
Error: [Errno 104] Connection reset by peer
Error: Server disconnected
Error: [Errno 104] Connection reset by peer
2023-03-15 05:51:15.936176: Printing
2023-03-15 05:51:17.231630: Printing
Error: [Errno 104] Connection reset by peer
Error: Server disconnected
Error: [Errno 104] Connection reset by peer
2023-03-15 05:51:21.716704: Printing
Error: [Errno 104] Connection reset by peer
2023-03-15 05:51:24.022139: Printing
Error: Server disconnected
Error: Server disconnected
Error: [Errno 104] Connection reset by peer
Error: [Errno 104] Connection reset by peer
2023-03-15 05:51:29.464997: Printing
Error: [Errno 104] Connection reset by peer
Error: Server disconnected
Error: [Errno 104] Connection reset by peer
Error: [Errno 104] Connection reset by peer
Error: [Errno 104] Connection reset by peer
Error: Server disconnected

I am not sure if the issue is caused by the REST server implementation of Octoprint itself or by the client libraries, but if the Home Assistant integration queries multiple endpoints in succession (which is very probable) this might explain the occasional unavailability.

markusressel avatar Mar 15 '23 04:03 markusressel

It seems like the pyoctoprintapi library even already contains intentional delays to account for some issue, see: https://github.com/rfleming71/pyoctoprintapi/commit/57350e887b1d18d62026a7bac6f493c6a8f72692

However, using this delay is (in my testing) not enough to overcome the issue completely. Seeing this leads me to believe that the root cause for this issue is really located inside of the OctoPrint web API and not the client libraries.

markusressel avatar Mar 15 '23 05:03 markusressel

I was able to find the core issue causing the "unavailable" states, see: https://github.com/OctoPrint/OctoPrint/issues/4764

Looking at the code of Home Assistant, the issue could be mitigated by adding a force_close=True to the code in: https://github.com/home-assistant/core/blob/d7de23fa6506b9e0f21c914966e2f6d3d639f25b/homeassistant/helpers/aiohttp_client.py#L278

However, this might be undesirable since it can degrade performance for other applications that do not have the same issue. Therefore my recommendation for now would be to modify the OctoPrint Home Assistant integration to explicitly not use the session provided by Home Assistant, and create a custom session instead, since the methods provided by Home Assistant are not configurable enough to allow for this.

I have searched through the codebase but found no other place where a custom TCPConnector is used, so I would like to get some advice from the core maintainers about how to do this right, if it is even a possibility.

markusressel avatar Mar 20 '23 03:03 markusressel

Do not tag people.

balloob avatar Mar 20 '23 03:03 balloob

I am sooo looking forward to seeing this fix deployed! my logbook has so many of these entries causing noise that is is difficult to make sense of real events when looking at it

sle118 avatar May 03 '23 19:05 sle118

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

PR is still pending review, not fixed yet.

markusressel avatar Aug 01 '23 21:08 markusressel