openhab-addons icon indicating copy to clipboard operation
openhab-addons copied to clipboard

[netatmo] API limit reached handling

Open clinique opened this issue 11 months ago • 29 comments

Distinct timing (60 minutes) when API LIMIT REACHED has been issued.

Partially resolves issue #16485

clinique avatar Mar 05 '24 13:03 clinique

Wow, that was fast! Thanks for taking care. I will take a closer look at my logs, your fixes and also give it a run on my production system.

You were lucky enough that I had some spare time today - I did not launch eclipse since around 3 month.

clinique avatar Mar 05 '24 23:03 clinique

@jlaur : if you can test on this base. I've reached something that seems pretty strong on heavy workload and normally solved the ghost process issue.

clinique avatar Mar 16 '24 08:03 clinique

@jlaur : if you can test on this base. I've reached something that seems pretty strong on heavy workload and normally solved the ghost process issue.

Thanks! I've backported your changes to 4.1.x and just put the binding on my production system. It started without issues, but quickly after I noticed this:

2024-03-16 12:41:35.835 [WARN ] [.core.thing.binding.BaseThingHandler] - Handler ApiBridgeHandler of thing netatmo:account:home tried checking if channel monitoring#request-count is linked although the handler was already disposed.

and:

java.lang.ClassCastException: Cannot cast org.openhab.binding.netatmo.internal.handler.ModuleHandler to org.openhab.binding.netatmo.internal.handler.CommonInterface
	at java.lang.Class.cast(Class.java:3889) ~[?:?]
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197) ~[?:?]
	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:179) ~[?:?]
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197) ~[?:?]
	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:179) ~[?:?]
	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:179) ~[?:?]
	at java.util.AbstractList$RandomAccessSpliterator.forEachRemaining(AbstractList.java:720) ~[?:?]
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509) ~[?:?]
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) ~[?:?]
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:575) ~[?:?]
	at java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260) ~[?:?]
	at java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:616) ~[?:?]
	at java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:622) ~[?:?]
	at java.util.stream.ReferencePipeline.toList(ReferencePipeline.java:627) ~[?:?]
	at org.openhab.binding.netatmo.internal.handler.CommonInterface.getActiveChildren(CommonInterface.java:156) ~[?:?]
	at org.openhab.binding.netatmo.internal.handler.CommonInterface.updateReadings(CommonInterface.java:213) ~[?:?]
	at org.openhab.binding.netatmo.internal.handler.CommonInterface.proceedWithUpdate(CommonInterface.java:207) ~[?:?]
	at org.openhab.binding.netatmo.internal.handler.capability.RefreshCapability.proceedWithUpdate(RefreshCapability.java:83) ~[?:?]
	at org.openhab.binding.netatmo.internal.handler.capability.RefreshCapability.lambda$3(RefreshCapability.java:124) ~[?:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:840) [?:?]

I'm not sure if the exception is related to the undisposed task from the previous version. In that case, it can be ignored.

I have now restarted openHAB and will let the binding run for some time, and I will also start analyzing your changes in relation to the experienced issues and see if I can figure out which changes addresses which issues. 🙂

jlaur avatar Mar 16 '24 11:03 jlaur

@clinique - strangely, if I grep "Module refreshed, next one in" /var/log/openhab/openhab.log, I see on only three minutes poll interval since yesterday when restarting the binding:

2024-03-18 22:28:30.940 [DEBUG] [handler.capability.RefreshCapability] - Module refreshed, next one in 180s
2024-03-18 22:28:31.065 [DEBUG] [handler.capability.RefreshCapability] - Module refreshed, next one in 180s
2024-03-18 22:31:33.954 [DEBUG] [handler.capability.RefreshCapability] - Module refreshed, next one in 180s
2024-03-18 22:31:34.200 [DEBUG] [handler.capability.RefreshCapability] - Module refreshed, next one in 180s
2024-03-18 22:34:34.491 [DEBUG] [handler.capability.RefreshCapability] - Module refreshed, next one in 180s
2024-03-18 22:34:34.664 [DEBUG] [handler.capability.RefreshCapability] - Module refreshed, next one in 180s

Also, I do not see anything logged regarding probing, i.e. "Data validity period identified to be" nor "Data validity period not yet found, data timestamp unchanged".

I cannot rule out that I made some mistake when backporting the changes, although I went over everything a few times. Do you get different results?

jlaur avatar Mar 18 '24 22:03 jlaur

@clinique - strangely, if I grep "Module refreshed, next one in" /var/log/openhab/openhab.log, I see on only three minutes poll interval since yesterday when restarting the binding:

2024-03-18 22:28:30.940 [DEBUG] [handler.capability.RefreshCapability] - Module refreshed, next one in 180s
2024-03-18 22:28:31.065 [DEBUG] [handler.capability.RefreshCapability] - Module refreshed, next one in 180s
2024-03-18 22:31:33.954 [DEBUG] [handler.capability.RefreshCapability] - Module refreshed, next one in 180s
2024-03-18 22:31:34.200 [DEBUG] [handler.capability.RefreshCapability] - Module refreshed, next one in 180s
2024-03-18 22:34:34.491 [DEBUG] [handler.capability.RefreshCapability] - Module refreshed, next one in 180s
2024-03-18 22:34:34.664 [DEBUG] [handler.capability.RefreshCapability] - Module refreshed, next one in 180s

Also, I do not see anything logged regarding probing, i.e. "Data validity period identified to be" nor "Data validity period not yet found, data timestamp unchanged".

I cannot rule out that I made some mistake when backporting the changes, although I went over everything a few times. Do you get different results?

Yes, I see the same. Looking at the thing definition, it has a refreshInterval configuration : image That's weird. Ok, got it.

clinique avatar Mar 18 '24 22:03 clinique

Yes, I see the same. Looking at the thing definition, it has a refreshInterval configuration : image That's weird.

I see you removed the default value of 180 seconds. But should it even have this configuration parameter? According to the documentation this only applies to the home Thing: https://www.openhab.org/addons/bindings/netatmo/#list-of-supported-things

jlaur avatar Mar 19 '24 07:03 jlaur

Yes, I see the same. Looking at the thing definition, it has a refreshInterval configuration : image That's weird.

I see you removed the default value of 180 seconds. But should it even have this configuration parameter? According to the documentation this only applies to the home Thing: https://www.openhab.org/addons/bindings/netatmo/#list-of-supported-things

Yes, it's only a quick fix. A change has been introduced here that created this issue. I'll take some to think of it and partially revert or correct PR #16492

clinique avatar Mar 19 '24 07:03 clinique

@jlaur : can you give a test or your feed-back on this ? I rebased with last merges.

clinique avatar Mar 23 '24 10:03 clinique

@jlaur : can you give a test or your feed-back on this ? I rebased with last merges.

Yes, I've had it running in production since Thursday after #16546 was merged. I'm currently running some tests after your latest commit today:

Additional module:

  • [x] Take out batteries and wait for it to go offline (when API returns "reachable": false). :green_circle:
  • [x] Insert batteries again and verify it goes back online. :green_circle:

Main module:

  • [x] Cut power and wait for it to go offline (when last timestamp is more than one hour ago). :green_circle:
  • [x] Power on again and verify it goes back online. :red_circle:

Restart binding:

  • [x] Verify number of calls (no disposed handlers still working). :green_circle:

Simulate API responses:

  • [x] OAuth authentication call failed. :red_circle:
  • [x] OAuth authentication call failed twice. :red_circle:
  • [x] Normal call failed. :red_circle:
  • [x] HTTP code 429. :green_circle:

Currently I'm checking the results of main module getting back online which doesn't seem to work.

For the last three tests I have provided a new console command for simulating exceptions and HTTP status codes.

I will keep you updated on the results.

jlaur avatar Mar 23 '24 14:03 jlaur

On first OAuth failure there are two status updates:

2024-03-23 16:42:16.408 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:account:home' changed from ONLINE to OFFLINE (CONFIGURATION_ERROR): Complete the configuration by granting the binding to Netatmo Connect : `/netatmo/connect/xxx`.
2024-03-23 16:42:16.420 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:account:home' changed from OFFLINE (CONFIGURATION_ERROR): Complete the configuration by granting the binding to Netatmo Connect : `/netatmo/connect/xxx`. to OFFLINE (COMMUNICATION_ERROR)

Retry is correctly scheduled:

2024-03-23 16:42:16.416 [DEBUG] [mo.internal.handler.ApiBridgeHandler] - Reconnection scheduled in 300 seconds

After second failure:

2024-03-23 16:47:16.470 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:account:home' changed from OFFLINE (COMMUNICATION_ERROR) to OFFLINE (CONFIGURATION_ERROR): Complete the configuration by granting the binding to Netatmo Connect : `/netatmo/connect/xxx`.

and it doesn't recover after that.

jlaur avatar Mar 23 '24 15:03 jlaur

For the last three tests I have provided a new console command for simulating exceptions and HTTP status codes.

In case you are interested, see 62ae29d4a10caa5b13222d514c1b17dbda3a8df5.

jlaur avatar Mar 23 '24 16:03 jlaur

@clinique - I completed the tests for now and marked those failing with a red circle here: https://github.com/openhab/openhab-addons/pull/16489#issuecomment-2016507891

Let me know how you would like to proceed. Some of them might have also failed before this PR, so they may not all be regressions. However, I would need to all tests again, so just let me know when I should check (if you prefer not to fix all within the context of this PR).

jlaur avatar Mar 23 '24 20:03 jlaur

@clinique - I completed the tests for now and marked those failing with a red circle here: #16489 (comment)

Let me know how you would like to proceed. Some of them might have also failed before this PR, so they may not all be regressions. However, I would need to all tests again, so just let me know when I should check (if you prefer not to fix all within the context of this PR).

Thanks for your great help here. I'm going to try to address them sequentially and will ping you along the way.

clinique avatar Mar 24 '24 10:03 clinique

@clinique - I completed the tests for now and marked those failing with a red circle here: #16489 (comment)

Let me know how you would like to proceed. Some of them might have also failed before this PR, so they may not all be regressions. However, I would need to all tests again, so just let me know when I should check (if you prefer not to fix all within the context of this PR).

@jlaur : I think you're right. I'm going to split this PR in multiple. It adresses to much things and gets difficulte to understand.

clinique avatar Mar 24 '24 17:03 clinique

@clinique - it looks like you reverted too much with your latest force push. The support for HTTP status code 429 is no longer here. Perhaps you can also update the PR description to state exactly what is fixed.

jlaur avatar Mar 25 '24 15:03 jlaur

@clinique - there is also a conflict to resolve.

jlaur avatar Mar 25 '24 17:03 jlaur

Baseline tests, i.e. with PR's already merged today, but without changes from this PR:

Additional module:

  • [x] Take out batteries and wait for it to go offline (when API returns "reachable": false). :green_circle:
  • [x] Insert batteries again and verify it goes back online. :green_circle:

Main module:

  • [x] Cut power and wait for it to go offline (when last timestamp is more than one hour ago). :green_circle:
  • [x] Power on again and verify it goes back online. :green_circle:

Restart binding:

  • [x] Verify number of calls (no disposed handlers still working). :red_circle:

Simulate API responses:

  • [x] OAuth authentication call failed (should retry after max 15 minutes). :green_circle:
  • [x] OAuth authentication call failed twice (should retry after max 15 minutes). :green_circle:
  • [x] Normal call failed. :green_circle: (logged, but status not updated)
  • [x] HTTP code 429. :green_circle: (behaves like normal call failed)

Test notes:

On second OAuth2 failure:

2024-03-25 18:29:39.637 [WARN ] [etatmo.internal.servlet.GrantServlet] - Registering servlet failed:ServletModel{id=ServletModel-58,name='org.openhab.binding.netatmo.internal.servlet.GrantServlet',alias='/netatmo/connect/xxx',urlPatterns=[/netatmo/connect/xxx/*],servlet=org.openhab.binding.netatmo.internal.servlet.GrantServlet@1c10460,contexts=[{HS,OCM-54,default,/}]} can't be registered. Context / already contains servlet mapping for alias /netatmo/connect/xxx: ServletModel{id=ServletModel-56,name='org.openhab.binding.netatmo.internal.servlet.GrantServlet',alias='/netatmo/connect/xxx',urlPatterns=[/netatmo/connect/xxx/*],servlet=org.openhab.binding.netatmo.internal.servlet.GrantServlet@136f4ae,contexts=[{HS,OCM-54,default,/}]}

After playing around for some time with failed calls, too many calls:

2024-03-25 18:39:41.734 [DEBUG] [handler.capability.RefreshCapability] - netatmo:weather-station:home:indoor_small_bathroom refreshed, next one in 300s
2024-03-25 18:39:41.762 [DEBUG] [handler.capability.RefreshCapability] - netatmo:weather-station:home:indoor_small_bathroom refreshed, next one in 120s
2024-03-25 18:40:39.161 [DEBUG] [handler.capability.RefreshCapability] - netatmo:weather-station:home:indoor_small_bathroom refreshed, next one in 120s
2024-03-25 18:41:44.420 [DEBUG] [handler.capability.RefreshCapability] - netatmo:weather-station:home:indoor_small_bathroom refreshed, next one in 120s
2024-03-25 18:42:39.513 [DEBUG] [handler.capability.RefreshCapability] - netatmo:weather-station:home:indoor_small_bathroom refreshed, next one in 120s

Will run these tests again with this PR applied after conflict has been resolved.

jlaur avatar Mar 25 '24 17:03 jlaur

@clinique - there is also a conflict to resolve.

Ah, now I understand: This PR depends on #16574 now.

jlaur avatar Mar 25 '24 19:03 jlaur

Conflicts need to be resolved.

lolodomo avatar Mar 29 '24 08:03 lolodomo

Conflicts need to be resolved.

Yes, I have to rebase on behalf of all that has been merged and come back working on this PR now that the landscape is cleared.

clinique avatar Mar 29 '24 08:03 clinique

@lolodomo : you can now proceed with the review of this one.

clinique avatar Mar 29 '24 10:03 clinique

@clinique - why did you close this PR?

jlaur avatar Apr 01 '24 15:04 jlaur

@clinique - why did you close this PR?

I don't know. Probably a mistake, I made some cleaning in my repo...sorry.

clinique avatar Apr 01 '24 15:04 clinique

@clinique - I'm now back to this.

Can I ask how you tested this? I'm asking because I have applied my "console error simulator", and I immediately ran into an issue when being authenticated already, but then hitting an HTTP error code 429:

2024-04-01 21:11:11.736 [DEBUG] [tatmo.internal.handler.DeviceHandler] - bridgeStatusChanged for bridge netatmo:weather-station:home:indoor to OFFLINE (COMMUNICATION_ERROR): @text/maximum-usage-reached [ "3600" ]
2024-04-01 21:11:11.736 [DEBUG] [tatmo.internal.handler.DeviceHandler] - bridgeStatusChanged for bridge netatmo:weather-station:home:indoor_small_bathroom to OFFLINE (COMMUNICATION_ERROR): @text/maximum-usage-reached [ "3600" ]
2024-04-01 21:11:11.738 [DEBUG] [tatmo.internal.handler.ModuleHandler] - bridgeStatusChanged for thing netatmo:indoor:home:indoor:indoor3 to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.740 [DEBUG] [tatmo.internal.handler.ModuleHandler] - bridgeStatusChanged for thing netatmo:indoor:home:indoor:indoor2 to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.742 [DEBUG] [mo.internal.handler.ApiBridgeHandler] - Reconnection scheduled in 3600 seconds
2024-04-01 21:11:11.742 [DEBUG] [tatmo.internal.handler.ModuleHandler] - bridgeStatusChanged for thing netatmo:outdoor:home:indoor:outdoor to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.743 [DEBUG] [tatmo.internal.handler.ModuleHandler] - bridgeStatusChanged for thing netatmo:indoor:home:indoor_small_bathroom:indoor_office to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.743 [DEBUG] [tatmo.internal.handler.ModuleHandler] - bridgeStatusChanged for thing netatmo:indoor:home:indoor_small_bathroom:indoor_alva to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.743 [DEBUG] [tatmo.internal.handler.ModuleHandler] - bridgeStatusChanged for thing netatmo:outdoor:home:indoor_small_bathroom:outdoor_shed to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.743 [DEBUG] [tatmo.internal.handler.ModuleHandler] - bridgeStatusChanged for thing netatmo:indoor:home:indoor_small_bathroom:indoor_linus to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.745 [WARN ] [handler.capability.WeatherCapability] - Error retrieving weather data '32': 
2024-04-01 21:11:11.748 [DEBUG] [handler.capability.RefreshCapability] - Thing 'netatmo:weather-station:home:indoor' is not ONLINE, using special refresh interval
2024-04-01 21:11:11.750 [DEBUG] [handler.capability.RefreshCapability] - 'netatmo:weather-station:home:indoor' next refresh in PT15M
2024-04-01 21:11:11.740 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:account:home' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): Maximum usage reached, will reconnect in 3600 seconds.
2024-04-01 21:11:11.746 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:weather-station:home:indoor_small_bathroom' changed from ONLINE to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.748 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:weather-station:home:indoor' changed from ONLINE to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.752 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:indoor:home:indoor:indoor3' changed from ONLINE to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.753 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:indoor:home:indoor:indoor2' changed from ONLINE to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.754 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:indoor:home:indoor_small_bathroom:indoor_office' changed from ONLINE to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.755 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:indoor:home:indoor_small_bathroom:indoor_alva' changed from ONLINE to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.755 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:indoor:home:indoor_small_bathroom:indoor_linus' changed from ONLINE to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.756 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:outdoor:home:indoor_small_bathroom:outdoor_shed' changed from ONLINE to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.757 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:outdoor:home:indoor:outdoor' changed from ONLINE to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:26:11.751 [DEBUG] [l.handler.capability.CacheCapability] - MeasureCapability requesting fresh data for netatmo:weather-station:home:indoor

(and then everything came back online)

Then problem I see here is that we seem not to be protected at all from worsening after MAXIMUM_USAGE_REACHED or HTTP error code 429, since the child handlers are still permitted to call the API, and are actively doing so.

Looking at the code, I think the "API limiting" is only for the corner case of missing authentication or request errors?

Maybe I'm testing it the wrong way or misunderstood the intentions.

EDIT: It seems that after this occurred, API calls are still made periodically, but nothing is updated anymore. Example:

2024-04-01 21:45:47.840 [DEBUG] [l.handler.capability.CacheCapability] - MeasureCapability requesting fresh data for netatmo:weather-station:home:indoor_small_bathroom
2024-04-01 21:45:47.842 [DEBUG] [l.handler.capability.CacheCapability] - WeatherCapability requesting fresh data for netatmo:weather-station:home:indoor_small_bathroom
2024-04-01 21:45:47.843 [DEBUG] [mo.internal.handler.ApiBridgeHandler] - executeUri GET  https://api.netatmo.com/api/getstationsdata?device_id=46&get_favorites=false 
2024-04-01 21:45:47.850 [TRACE] [mo.internal.handler.ApiBridgeHandler] -  -with headers: Accept-Encoding: gzip, User-Agent: Jetty/9.4.52.v20230823, Authorization: Bearer xxx 
2024-04-01 21:45:48.166 [TRACE] [mo.internal.handler.ApiBridgeHandler] -  -returned: code [200 OK] body [...]
2024-04-01 21:45:48.170 [DEBUG] [l.handler.capability.CacheCapability] - MeasureCapability requesting fresh data for netatmo:indoor:home:indoor_small_bathroom:indoor_office
2024-04-01 21:45:48.172 [DEBUG] [l.handler.capability.CacheCapability] - MeasureCapability requesting fresh data for netatmo:outdoor:home:indoor_small_bathroom:outdoor_shed
2024-04-01 21:45:48.173 [DEBUG] [l.handler.capability.CacheCapability] - MeasureCapability requesting fresh data for netatmo:indoor:home:indoor_small_bathroom:indoor_alva
2024-04-01 21:45:48.174 [DEBUG] [l.handler.capability.CacheCapability] - MeasureCapability requesting fresh data for netatmo:indoor:home:indoor_small_bathroom:indoor_linus
2024-04-01 21:45:48.183 [DEBUG] [handler.capability.RefreshCapability] - 'netatmo:weather-station:home:indoor_small_bathroom' next refresh in PT10M4.816913174S

jlaur avatar Apr 01 '24 19:04 jlaur

@clinique - I'm now back to this.

Can I ask how you tested this? I'm asking because I have applied my "console error simulator", and I immediately ran into an issue when being authenticated already, but then hitting an HTTP error code 429:

2024-04-01 21:11:11.736 [DEBUG] [tatmo.internal.handler.DeviceHandler] - bridgeStatusChanged for bridge netatmo:weather-station:home:indoor to OFFLINE (COMMUNICATION_ERROR): @text/maximum-usage-reached [ "3600" ]
2024-04-01 21:11:11.736 [DEBUG] [tatmo.internal.handler.DeviceHandler] - bridgeStatusChanged for bridge netatmo:weather-station:home:indoor_small_bathroom to OFFLINE (COMMUNICATION_ERROR): @text/maximum-usage-reached [ "3600" ]
2024-04-01 21:11:11.738 [DEBUG] [tatmo.internal.handler.ModuleHandler] - bridgeStatusChanged for thing netatmo:indoor:home:indoor:indoor3 to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.740 [DEBUG] [tatmo.internal.handler.ModuleHandler] - bridgeStatusChanged for thing netatmo:indoor:home:indoor:indoor2 to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.742 [DEBUG] [mo.internal.handler.ApiBridgeHandler] - Reconnection scheduled in 3600 seconds
2024-04-01 21:11:11.742 [DEBUG] [tatmo.internal.handler.ModuleHandler] - bridgeStatusChanged for thing netatmo:outdoor:home:indoor:outdoor to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.743 [DEBUG] [tatmo.internal.handler.ModuleHandler] - bridgeStatusChanged for thing netatmo:indoor:home:indoor_small_bathroom:indoor_office to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.743 [DEBUG] [tatmo.internal.handler.ModuleHandler] - bridgeStatusChanged for thing netatmo:indoor:home:indoor_small_bathroom:indoor_alva to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.743 [DEBUG] [tatmo.internal.handler.ModuleHandler] - bridgeStatusChanged for thing netatmo:outdoor:home:indoor_small_bathroom:outdoor_shed to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.743 [DEBUG] [tatmo.internal.handler.ModuleHandler] - bridgeStatusChanged for thing netatmo:indoor:home:indoor_small_bathroom:indoor_linus to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.745 [WARN ] [handler.capability.WeatherCapability] - Error retrieving weather data '32': 
2024-04-01 21:11:11.748 [DEBUG] [handler.capability.RefreshCapability] - Thing 'netatmo:weather-station:home:indoor' is not ONLINE, using special refresh interval
2024-04-01 21:11:11.750 [DEBUG] [handler.capability.RefreshCapability] - 'netatmo:weather-station:home:indoor' next refresh in PT15M
2024-04-01 21:11:11.740 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:account:home' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): Maximum usage reached, will reconnect in 3600 seconds.
2024-04-01 21:11:11.746 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:weather-station:home:indoor_small_bathroom' changed from ONLINE to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.748 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:weather-station:home:indoor' changed from ONLINE to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.752 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:indoor:home:indoor:indoor3' changed from ONLINE to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.753 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:indoor:home:indoor:indoor2' changed from ONLINE to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.754 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:indoor:home:indoor_small_bathroom:indoor_office' changed from ONLINE to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.755 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:indoor:home:indoor_small_bathroom:indoor_alva' changed from ONLINE to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.755 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:indoor:home:indoor_small_bathroom:indoor_linus' changed from ONLINE to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.756 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:outdoor:home:indoor_small_bathroom:outdoor_shed' changed from ONLINE to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:11:11.757 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'netatmo:outdoor:home:indoor:outdoor' changed from ONLINE to OFFLINE (BRIDGE_OFFLINE)
2024-04-01 21:26:11.751 [DEBUG] [l.handler.capability.CacheCapability] - MeasureCapability requesting fresh data for netatmo:weather-station:home:indoor

(and then everything came back online)

Then problem I see here is that we seem not to be protected at all from worsening after MAXIMUM_USAGE_REACHED or HTTP error code 429, since the child handlers are still permitted to call the API, and are actively doing so.

Looking at the code, I think the "API limiting" is only for the corner case of missing authentication or request errors?

Maybe I'm testing it the wrong way or misunderstood the intentions.

EDIT: It seems that after this occurred, API calls are still made periodically, but nothing is updated anymore. Example:

2024-04-01 21:45:47.840 [DEBUG] [l.handler.capability.CacheCapability] - MeasureCapability requesting fresh data for netatmo:weather-station:home:indoor_small_bathroom
2024-04-01 21:45:47.842 [DEBUG] [l.handler.capability.CacheCapability] - WeatherCapability requesting fresh data for netatmo:weather-station:home:indoor_small_bathroom
2024-04-01 21:45:47.843 [DEBUG] [mo.internal.handler.ApiBridgeHandler] - executeUri GET  https://api.netatmo.com/api/getstationsdata?device_id=46&get_favorites=false 
2024-04-01 21:45:47.850 [TRACE] [mo.internal.handler.ApiBridgeHandler] -  -with headers: Accept-Encoding: gzip, User-Agent: Jetty/9.4.52.v20230823, Authorization: Bearer xxx 
2024-04-01 21:45:48.166 [TRACE] [mo.internal.handler.ApiBridgeHandler] -  -returned: code [200 OK] body [...]
2024-04-01 21:45:48.170 [DEBUG] [l.handler.capability.CacheCapability] - MeasureCapability requesting fresh data for netatmo:indoor:home:indoor_small_bathroom:indoor_office
2024-04-01 21:45:48.172 [DEBUG] [l.handler.capability.CacheCapability] - MeasureCapability requesting fresh data for netatmo:outdoor:home:indoor_small_bathroom:outdoor_shed
2024-04-01 21:45:48.173 [DEBUG] [l.handler.capability.CacheCapability] - MeasureCapability requesting fresh data for netatmo:indoor:home:indoor_small_bathroom:indoor_alva
2024-04-01 21:45:48.174 [DEBUG] [l.handler.capability.CacheCapability] - MeasureCapability requesting fresh data for netatmo:indoor:home:indoor_small_bathroom:indoor_linus
2024-04-01 21:45:48.183 [DEBUG] [handler.capability.RefreshCapability] - 'netatmo:weather-station:home:indoor_small_bathroom' next refresh in PT10M4.816913174S

I'll be back on this one very soon. I'm currently focused on another PR I had open since a while.

clinique avatar Apr 02 '24 16:04 clinique

@clinique @jlaur : What is the status here ? Is it work on progress or finished and ready for a review?

lolodomo avatar Apr 19 '24 07:04 lolodomo

@clinique @jlaur : What is the status here ? Is it work on progress or finished and ready for a review?

Unfinished - I have to get back on it

clinique avatar Apr 19 '24 07:04 clinique