hass-aarlo icon indicating copy to clipboard operation
hass-aarlo copied to clipboard

Memory/Thread leak when unable to authenticate.

Open ynazar1 opened this issue 2 years ago • 22 comments

If Arlo integration is unable to auth it slowly leaks threads that never finish, eventually crashing home assistant.

unnamed

Running v0.7.2b8.3 version on HASSOS 8.5 Homeassistant: 2022.9.7 with following configs:

aarlo: username: !secret arlo.user password: !secret arlo.pass refresh_devices_every: 2 stream_timeout: 120 refresh_modes_every: 1 reconnect_every: 30 request_timeout: 120 packet_dump: false backend: sse tfa_source: push tfa_type: PUSH

Due to personal reasons my phone changed and there's an Arlo bug that prevents push notifications. As they are currently broken ARLO never completes authentication and leaks threads crashing home assistant after about 14 days due to out of memory exception.

ynazar1 avatar Oct 04 '22 04:10 ynazar1

I have the same problem :(

ProAdmin007 avatar Oct 12 '22 20:10 ProAdmin007

I have the same problem :(

@ProAdmin007 Arlo did fix the PUSH auth bug as of recently if you are using PUSH & 2FA and push notifications aren't coming in. You'll need to make sure you're on 3.7.1+ application version and then remove & add the phone as a trusted device and then push notifications will start showing up correctly. Since I've had same situation, this resolved Arlo not working for me... but the bug documented still exists and is reproducible.

ynazar1 avatar Oct 12 '22 20:10 ynazar1

Ah thank you. I will give it a try :)

ProAdmin007 avatar Oct 12 '22 21:10 ProAdmin007

I see you have a workaround but the latest 0.7 release should fix the issue.

twrecked avatar Oct 12 '22 23:10 twrecked

I'm having this issue as well. I'm currently on v0.7.4b18. You can see below the memory leak as soon as Arlo required a new 2FA approval and didn't get it. image

I BELIEVE this issue may have been happening for quite some time for me whenever HA briefly loses internet connectivity (but it could also be coincidental). The rate of the memory leak is the same as this time and I've never had luck tracing the problem otherwise.

brooksben11 avatar Aug 09 '23 02:08 brooksben11

@brooksben11 : Does the leak continue indefinitely until system crashes or does it give up eventually?

ynazar1 avatar Aug 09 '23 05:08 ynazar1

@brooksben11 : Does the leak continue indefinitely until system crashes or does it give up eventually?

I haven't let it run that long (although I've seen it climb to nearly 100%). I used to run HA on a system with a bit less memory and if this was the same issue, then yes, it would crash it.

brooksben11 avatar Aug 09 '23 12:08 brooksben11

@brooksben11 What sort of system are you running this on?

twrecked avatar Aug 09 '23 12:08 twrecked

@brooksben11 What sort of system are you running this on?

In the past when it would crash, it was a RPi 3B+ but I've been on an old laptop running HAOS on bare metal (Intel I3 with 4 gigs of ram).

brooksben11 avatar Aug 09 '23 13:08 brooksben11

Do you know where it's writing the logs?

twrecked avatar Aug 09 '23 14:08 twrecked

Do you know where it's writing the logs?

Not sure I follow. Like how to look at the logs?

brooksben11 avatar Aug 09 '23 17:08 brooksben11

I'm looking at if I'm leaking resources, the original code was broken and would create a lot of threads but that should be fixed now.

Are you using mqtt or sse? Or can you paste your config, with the sensitive data obscured obviously.

But I'm also wondering where the logs are being written, if it's to ram disk then that might account for the memory usage.

twrecked avatar Aug 09 '23 19:08 twrecked

I'm looking at if I'm leaking resources, the original code was broken and would create a lot of threads but that should be fixed now.

Are you using mqtt or sse? Or can you paste your config, with the sensitive data obscured obviously.

But I'm also wondering where the logs are being written, if it's to ram disk then that might account for the memory usage.

That I can do!

aarlo:
  username: !secret aarlo_username
  password: !secret aarlo_password
  refresh_devices_every: 2
  stream_timeout: 120
  tfa_source: push
  tfa_type: PUSH
  #tfa_retries: # optional: default 5; attempts to check for push approved
  #tfa_delay: # optional: default 5; seconds of delay between retries
  stream_snapshot: true
  backend: sse #added to get events to start showing up again

Obviously have the individual sensor, camera, etc. entities defined as well, but I assume you don't care about that?

BTW, thanks for all your work on this! I know Arlo doesn't make it easy, but you've allowed me to keep getting more useful life out of a less-than-ideal camera system.

brooksben11 avatar Aug 09 '23 19:08 brooksben11

@twrecked just had it happen again and quite confident now that the integration is the culprit, per this spamming the logs:

2023-08-11 11:21:15.992 ERROR (ArloEventStream) [pyaarlo] 2fa finishAuth failed
2023-08-11 11:21:21.252 ERROR (ArloEventStream) [pyaarlo] session start failed
2023-08-11 11:21:43.808 ERROR (ArloEventStream) [pyaarlo] 2fa finishAuth failed
2023-08-11 11:21:49.067 ERROR (ArloEventStream) [pyaarlo] session start failed
2023-08-11 11:22:11.598 ERROR (ArloEventStream) [pyaarlo] 2fa finishAuth failed
2023-08-11 11:22:16.859 ERROR (ArloEventStream) [pyaarlo] session start failed
2023-08-11 11:22:40.235 ERROR (ArloEventStream) [pyaarlo] 2fa finishAuth failed
2023-08-11 11:22:45.494 ERROR (ArloEventStream) [pyaarlo] session start failed
2023-08-11 11:23:08.033 ERROR (ArloEventStream) [pyaarlo] 2fa finishAuth failed
2023-08-11 11:23:13.298 ERROR (ArloEventStream) [pyaarlo] session start failed

image

The only reason it's starting to level out some is because it's started dumping it all to swap memory: image

Let me know if you need anything else. I've held off on updating to the latest version simply because of all the issues the last few months; figure leaving well-enough alone is best but am more than happy to upgrade if you'd like me to for testing.

Edit: Just got out my secondary phone that I use for Arlo 2FA approvals for this integration and when I logged in it immediately asked for approval. After doing so my Arlo entities came back online without me having to restart HA. Going to go ahead and restart anyways since my memory is still being wasted.

brooksben11 avatar Aug 11 '23 16:08 brooksben11

I'm sure it is the reason I just can't work out why at the moment. Python has garbage collection so I must have created a loop somewhere.

edit: I might have just seen it... can you turn debug on? I think it's stuck in a login loop.

twrecked avatar Aug 11 '23 17:08 twrecked

Will do. I've set pyaarlo and custom_components.aarlo to debug.

I can try resetting my router to see if it causes the issue (like I said before, this always causes a similar symptom but I've never really taken the time to try and hunt down the source).

brooksben11 avatar Aug 11 '23 17:08 brooksben11

That would be great, but only if you don't mind.

And can you tell me the path to the homeassistant.log file on the image.

twrecked avatar Aug 11 '23 18:08 twrecked

Not seeing a spike in memory after restarting my router, but I'm wondering if it's because I JUST re-authenticated my connection Arlo? I can try again in a few days to see if I can cause the issue to reoccur (or if you have any ideas of how to cause it).

I believe it's just in the config folder? I haven't done anything to change the defaults and I see a "home-assistant.log.1" file there. I should have a pretty vanilla HAOS install.

brooksben11 avatar Aug 11 '23 18:08 brooksben11

@brooksben11 It would be awesome if you could figure out what's eating memory once leak occurs. Hass Glances plugin can tell you what process is using memory on the system and you could try to thread dump it to see what's hold threads open or at least to narrow it down a bit more in addition to logs.

ynazar1 avatar Aug 11 '23 18:08 ynazar1

@brooksben11 It would be awesome if you could figure out what's eating memory once leak occurs. Hass Glances plugin can tell you what process is using memory on the system and you could try to thread dump it to see what's hold threads open or at least to narrow it down a bit more in addition to logs.

I actually already have the Glances Add-on installed so I can pull in CPU temp. I've honestly never used it beyond that single purpose. I'll try to remember and poke around when it happens again.

brooksben11 avatar Aug 11 '23 18:08 brooksben11

@twrecked Restarted my router this morning and am able to replicate the issue again. Here's what my logs look like (obviously just a snippet, but I THINK I grabbed everything that is repeating:

2023-08-13 09:52:18.833 WARNING (ArloEventStream) [pyaarlo] error in new response={'meta': {'code': 400, 'error': 9233, 'message': 'Authentication is not finished yet'}}
2023-08-13 09:52:18.834 WARNING (ArloEventStream) [pyaarlo] 2fa finishAuth - tries 1
2023-08-13 09:52:23.836 DEBUG (ArloEventStream) [pyaarlo] finishing auth
2023-08-13 09:52:24.089 WARNING (ArloEventStream) [pyaarlo] error in new response={'meta': {'code': 400, 'error': 9233, 'message': 'Authentication is not finished yet'}}
2023-08-13 09:52:24.092 WARNING (ArloEventStream) [pyaarlo] 2fa finishAuth - tries 2
2023-08-13 09:52:29.094 DEBUG (ArloEventStream) [pyaarlo] finishing auth
2023-08-13 09:52:29.351 WARNING (ArloEventStream) [pyaarlo] error in new response={'meta': {'code': 400, 'error': 9233, 'message': 'Authentication is not finished yet'}}
2023-08-13 09:52:29.353 WARNING (ArloEventStream) [pyaarlo] 2fa finishAuth - tries 3
2023-08-13 09:52:34.357 DEBUG (ArloEventStream) [pyaarlo] finishing auth
2023-08-13 09:52:34.991 WARNING (ArloEventStream) [pyaarlo] error in new response={'meta': {'code': 400, 'error': 9233, 'message': 'Authentication is not finished yet'}}
2023-08-13 09:52:34.994 WARNING (ArloEventStream) [pyaarlo] 2fa finishAuth - tries 4
2023-08-13 09:52:39.996 DEBUG (ArloEventStream) [pyaarlo] finishing auth
2023-08-13 09:52:40.307 WARNING (ArloEventStream) [pyaarlo] error in new response={'meta': {'code': 400, 'error': 9238, 'message': 'Maximum permitted time for authentication has lapsed'}}
2023-08-13 09:52:40.308 WARNING (ArloEventStream) [pyaarlo] 2fa finishAuth - tries 5
2023-08-13 09:52:40.310 ERROR (ArloEventStream) [pyaarlo] 2fa finishAuth failed
2023-08-13 09:52:45.313 DEBUG (ArloEventStream) [pyaarlo] re-logging in
2023-08-13 09:52:45.313 DEBUG (ArloEventStream) [pyaarlo] looking for user_agent arlo
2023-08-13 09:52:45.314 DEBUG (ArloEventStream) [pyaarlo] newish sessions, re-using
2023-08-13 09:52:45.575 ERROR (ArloEventStream) [pyaarlo] session start failed
2023-08-13 09:52:45.576 DEBUG (ArloEventStream) [pyaarlo] looking for user_agent arlo
2023-08-13 09:52:45.672 DEBUG (ArloEventStream) [pyaarlo] oldish session, getting a new one
2023-08-13 09:52:45.672 DEBUG (ArloEventStream) [pyaarlo] login attempt #1
2023-08-13 09:52:45.995 DEBUG (ArloEventStream) [pyaarlo] need 2FA...
2023-08-13 09:52:45.995 DEBUG (ArloEventStream) [pyaarlo] getting tfa choices
2023-08-13 09:52:46.291 DEBUG (ArloEventStream) [pyaarlo] looking for push
2023-08-13 09:52:46.291 DEBUG (ArloEventStream) [pyaarlo] starting auth with push
2023-08-13 09:52:46.830 DEBUG (ArloEventStream) [pyaarlo] finishing auth
2023-08-13 09:52:47.077 WARNING (ArloEventStream) [pyaarlo] error in new response={'meta': {'code': 400, 'error': 9233, 'message': 'Authentication is not finished yet'}}
2023-08-13 09:52:47.079 WARNING (ArloEventStream) [pyaarlo] 2fa finishAuth - tries 1
2023-08-13 09:52:52.080 DEBUG (ArloEventStream) [pyaarlo] finishing auth
2023-08-13 09:52:52.354 WARNING (ArloEventStream) [pyaarlo] error in new response={'meta': {'code': 400, 'error': 9233, 'message': 'Authentication is not finished yet'}}
2023-08-13 09:52:52.357 WARNING (ArloEventStream) [pyaarlo] 2fa finishAuth - tries 2
2023-08-13 09:52:54.466 DEBUG (ArloBackgroundWorker) [pyaarlo] pinging Living Room
2023-08-13 09:52:54.734 DEBUG (ArloBackgroundWorker) [pyaarlo] 4R036C7LA0FA0: set:ArloBase/4R036C7LA0FA0/connectionState=unavailable
2023-08-13 09:52:54.734 DEBUG (ArloBackgroundWorker) [custom_components.aarlo.binary_sensor] callback:Connected Living Room:connectionState:unavailable
2023-08-13 09:52:54.734 DEBUG (ArloBackgroundWorker) [pyaarlo] Living Room: connectionState: unavailable
2023-08-13 09:52:57.360 DEBUG (ArloEventStream) [pyaarlo] finishing auth
2023-08-13 09:52:57.629 WARNING (ArloEventStream) [pyaarlo] error in new response={'meta': {'code': 400, 'error': 9233, 'message': 'Authentication is not finished yet'}}
2023-08-13 09:52:57.630 WARNING (ArloEventStream) [pyaarlo] 2fa finishAuth - tries 3
2023-08-13 09:53:02.632 DEBUG (ArloEventStream) [pyaarlo] finishing auth
2023-08-13 09:53:02.899 WARNING (ArloEventStream) [pyaarlo] error in new response={'meta': {'code': 400, 'error': 9233, 'message': 'Authentication is not finished yet'}}
2023-08-13 09:53:02.901 WARNING (ArloEventStream) [pyaarlo] 2fa finishAuth - tries 4
2023-08-13 09:53:07.903 DEBUG (ArloEventStream) [pyaarlo] finishing auth
2023-08-13 09:53:08.192 WARNING (ArloEventStream) [pyaarlo] error in new response={'meta': {'code': 400, 'error': 9238, 'message': 'Maximum permitted time for authentication has lapsed'}}
2023-08-13 09:53:08.193 WARNING (ArloEventStream) [pyaarlo] 2fa finishAuth - tries 5
2023-08-13 09:53:08.195 ERROR (ArloEventStream) [pyaarlo] 2fa finishAuth failed
2023-08-13 09:53:13.198 DEBUG (ArloEventStream) [pyaarlo] re-logging in
2023-08-13 09:53:13.198 DEBUG (ArloEventStream) [pyaarlo] looking for user_agent arlo
2023-08-13 09:53:13.198 DEBUG (ArloEventStream) [pyaarlo] newish sessions, re-using
2023-08-13 09:53:13.474 ERROR (ArloEventStream) [pyaarlo] session start failed
2023-08-13 09:53:13.476 DEBUG (ArloEventStream) [pyaarlo] looking for user_agent arlo
2023-08-13 09:53:13.559 DEBUG (ArloEventStream) [pyaarlo] oldish session, getting a new one
2023-08-13 09:53:13.559 DEBUG (ArloEventStream) [pyaarlo] login attempt #1
2023-08-13 09:53:13.917 DEBUG (ArloEventStream) [pyaarlo] need 2FA...
2023-08-13 09:53:13.917 DEBUG (ArloEventStream) [pyaarlo] getting tfa choices
2023-08-13 09:53:14.215 DEBUG (ArloEventStream) [pyaarlo] looking for push
2023-08-13 09:53:14.215 DEBUG (ArloEventStream) [pyaarlo] starting auth with push
2023-08-13 09:53:14.709 DEBUG (ArloEventStream) [pyaarlo] finishing auth
2023-08-13 09:53:14.969 WARNING (ArloEventStream) [pyaarlo] error in new response={'meta': {'code': 400, 'error': 9233, 'message': 'Authentication is not finished yet'}}
2023-08-13 09:53:14.971 WARNING (ArloEventStream) [pyaarlo] 2fa finishAuth - tries 1
2023-08-13 09:53:19.973 DEBUG (ArloEventStream) [pyaarlo] finishing auth
2023-08-13 09:53:20.240 WARNING (ArloEventStream) [pyaarlo] error in new response={'meta': {'code': 400, 'error': 9233, 'message': 'Authentication is not finished yet'}}
2023-08-13 09:53:20.242 WARNING (ArloEventStream) [pyaarlo] 2fa finishAuth - tries 2
2023-08-13 09:53:25.244 DEBUG (ArloEventStream) [pyaarlo] finishing auth
2023-08-13 09:53:25.503 WARNING (ArloEventStream) [pyaarlo] error in new response={'meta': {'code': 400, 'error': 9233, 'message': 'Authentication is not finished yet'}}
2023-08-13 09:53:25.505 WARNING (ArloEventStream) [pyaarlo] 2fa finishAuth - tries 3
2023-08-13 09:53:30.507 DEBUG (ArloEventStream) [pyaarlo] finishing auth
2023-08-13 09:53:30.780 WARNING (ArloEventStream) [pyaarlo] error in new response={'meta': {'code': 400, 'error': 9233, 'message': 'Authentication is not finished yet'}}
2023-08-13 09:53:30.785 WARNING (ArloEventStream) [pyaarlo] 2fa finishAuth - tries 4
2023-08-13 09:53:35.788 DEBUG (ArloEventStream) [pyaarlo] finishing auth
2023-08-13 09:53:36.043 WARNING (ArloEventStream) [pyaarlo] error in new response={'meta': {'code': 400, 'error': 9238, 'message': 'Maximum permitted time for authentication has lapsed'}}
2023-08-13 09:53:36.045 WARNING (ArloEventStream) [pyaarlo] 2fa finishAuth - tries 5
2023-08-13 09:53:36.047 ERROR (ArloEventStream) [pyaarlo] 2fa finishAuth failed
2023-08-13 09:53:41.049 DEBUG (ArloEventStream) [pyaarlo] re-logging in
2023-08-13 09:53:41.049 DEBUG (ArloEventStream) [pyaarlo] looking for user_agent arlo
2023-08-13 09:53:41.049 DEBUG (ArloEventStream) [pyaarlo] newish sessions, re-using
2023-08-13 09:53:41.314 ERROR (ArloEventStream) [pyaarlo] session start failed
2023-08-13 09:53:41.315 DEBUG (ArloEventStream) [pyaarlo] looking for user_agent arlo
2023-08-13 09:53:41.384 DEBUG (ArloEventStream) [pyaarlo] oldish session, getting a new one
2023-08-13 09:53:41.384 DEBUG (ArloEventStream) [pyaarlo] login attempt #1
2023-08-13 09:53:42.080 DEBUG (ArloEventStream) [pyaarlo] need 2FA...
2023-08-13 09:53:42.080 DEBUG (ArloEventStream) [pyaarlo] getting tfa choices
2023-08-13 09:53:42.355 DEBUG (ArloEventStream) [pyaarlo] looking for push
2023-08-13 09:53:42.355 DEBUG (ArloEventStream) [pyaarlo] starting auth with push
2023-08-13 09:53:42.875 DEBUG (ArloEventStream) [pyaarlo] finishing auth
2023-08-13 09:53:43.139 WARNING (ArloEventStream) [pyaarlo] error in new response={'meta': {'code': 400, 'error': 9233, 'message': 'Authentication is not finished yet'}}
2023-08-13 09:53:43.140 WARNING (ArloEventStream) [pyaarlo] 2fa finishAuth - tries 1
2023-08-13 09:53:48.143 DEBUG (ArloEventStream) [pyaarlo] finishing auth
2023-08-13 09:53:48.789 WARNING (ArloEventStream) [pyaarlo] error in new response={'meta': {'code': 400, 'error': 9233, 'message': 'Authentication is not finished yet'}}
2023-08-13 09:53:48.791 WARNING (ArloEventStream) [pyaarlo] 2fa finishAuth - tries 2
2023-08-13 09:53:53.793 DEBUG (ArloEventStream) [pyaarlo] finishing auth
2023-08-13 09:53:54.060 WARNING (ArloEventStream) [pyaarlo] error in new response={'meta': {'code': 400, 'error': 9233, 'message': 'Authentication is not finished yet'}}
2023-08-13 09:53:54.062 WARNING (ArloEventStream) [pyaarlo] 2fa finishAuth - tries 3
2023-08-13 09:53:54.468 DEBUG (ArloBackgroundWorker) [pyaarlo] pinging Living Room
2023-08-13 09:53:54.736 DEBUG (ArloBackgroundWorker) [pyaarlo] 4R036C7LA0FA0: set:ArloBase/4R036C7LA0FA0/connectionState=unavailable
2023-08-13 09:53:54.736 DEBUG (ArloBackgroundWorker) [custom_components.aarlo.binary_sensor] callback:Connected Living Room:connectionState:unavailable
2023-08-13 09:53:54.736 DEBUG (ArloBackgroundWorker) [pyaarlo] Living Room: connectionState: unavailable
2023-08-13 09:53:59.064 DEBUG (ArloEventStream) [pyaarlo] finishing auth
2023-08-13 09:53:59.330 WARNING (ArloEventStream) [pyaarlo] error in new response={'meta': {'code': 400, 'error': 9233, 'message': 'Authentication is not finished yet'}}
2023-08-13 09:53:59.332 WARNING (ArloEventStream) [pyaarlo] 2fa finishAuth - tries 4
2023-08-13 09:54:04.334 DEBUG (ArloEventStream) [pyaarlo] finishing auth
2023-08-13 09:54:04.605 WARNING (ArloEventStream) [pyaarlo] error in new response={'meta': {'code': 400, 'error': 9238, 'message': 'Maximum permitted time for authentication has lapsed'}}
2023-08-13 09:54:04.607 WARNING (ArloEventStream) [pyaarlo] 2fa finishAuth - tries 5
2023-08-13 09:54:04.609 ERROR (ArloEventStream) [pyaarlo] 2fa finishAuth failed
2023-08-13 09:54:09.611 DEBUG (ArloEventStream) [pyaarlo] re-logging in
2023-08-13 09:54:09.611 DEBUG (ArloEventStream) [pyaarlo] looking for user_agent arlo
2023-08-13 09:54:09.612 DEBUG (ArloEventStream) [pyaarlo] newish sessions, re-using
2023-08-13 09:54:09.890 ERROR (ArloEventStream) [pyaarlo] session start failed
2023-08-13 09:54:09.893 DEBUG (ArloEventStream) [pyaarlo] looking for user_agent arlo
2023-08-13 09:54:09.974 DEBUG (ArloEventStream) [pyaarlo] oldish session, getting a new one
2023-08-13 09:54:09.974 DEBUG (ArloEventStream) [pyaarlo] login attempt #1
2023-08-13 09:54:10.659 DEBUG (ArloEventStream) [pyaarlo] need 2FA...
2023-08-13 09:54:10.659 DEBUG (ArloEventStream) [pyaarlo] getting tfa choices
2023-08-13 09:54:10.965 DEBUG (ArloEventStream) [pyaarlo] looking for push
2023-08-13 09:54:10.965 DEBUG (ArloEventStream) [pyaarlo] starting auth with push
2023-08-13 09:54:11.475 DEBUG (ArloEventStream) [pyaarlo] finishing auth
2023-08-13 09:54:11.739 WARNING (ArloEventStream) [pyaarlo] error in new response={'meta': {'code': 400, 'error': 9233, 'message': 'Authentication is not finished yet'}}
2023-08-13 09:54:11.741 WARNING (ArloEventStream) [pyaarlo] 2fa finishAuth - tries 1

Let me know if there's any other info you need while it's in this state. I'll leave it as-is for a while before restarting/re-authenticating.

brooksben11 avatar Aug 13 '23 15:08 brooksben11