Rate limiting in scripts via condition clause


Context

Rate limiting is especially useful, but it requires global state, which makes it hard to implement in isolation in the YAML configuration. It is particularly useful for notifications; this (now closed) PR is an example of the motivation:

https://github.com/home-assistant/architecture/issues/378

@balloob proposed generalizing this, and I'm open to that.

Proposal

We introduce a rate-limiting condition that evaluates to true iff the rate limit is not exceeded, and we allow a tag to indicate which rate-limit queue to key off of. E.g.,

condition: rate_limit
tag: camera_notify # scoped to condition tags only
max_per_minute: 1
max_per_hour: 10
max_per_day: 50
max_per_time: 7
over_time: '0:15'

At least one of max_per_{minute,hour,day,time} is required. over_time must be specified iff max_per_time is specified; together they define a custom time interval.

Any of the rate limits being broken means the condition evaluates to false.

If "tag" is omitted then it's a unique rate_limit queue for this instance. (Alternative: we could require "tag".)

These conditions work inside triggers and inside scripts (and anywhere else conditions are supported). They are subject to the normal short-circuiting rules, etc., and the rate-limit queue is only updated when the condition is actually tested.
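To make the proposed semantics concrete, here is a minimal Python sketch of a single gate that checks several limits at once. It is not the Home Assistant implementation: the class name RateLimitGate, its test method, and the injectable clock are all hypothetical, and it assumes that an event is recorded only when every limit passes.

```python
import time
from collections import deque
from datetime import timedelta


class RateLimitGate:
    """Hypothetical gate: passes only while every configured limit has room."""

    def __init__(self, limits, clock=time.monotonic):
        # limits: iterable of (max_events, window) pairs, window as a timedelta.
        self._limits = [(m, w.total_seconds()) for m, w in limits]
        self._longest = max(w for _, w in self._limits)
        self._events = deque()  # timestamps of passes, oldest first
        self._clock = clock

    def test(self):
        now = self._clock()
        # Drop events older than the longest window; no limit needs them.
        while self._events and now - self._events[0] >= self._longest:
            self._events.popleft()
        for max_events, window in self._limits:
            recent = sum(1 for t in self._events if now - t < window)
            if recent >= max_events:
                return False  # a limit is exhausted; nothing is recorded
        self._events.append(now)  # assumption: record only when all limits pass
        return True


# Demo with a fake clock: at most 1 per minute and 2 per hour.
fake_now = [0.0]
gate = RateLimitGate(
    [(1, timedelta(minutes=1)), (2, timedelta(hours=1))],
    clock=lambda: fake_now[0],
)
print(gate.test())  # True
print(gate.test())  # False, the per-minute limit is exhausted
fake_now[0] = 61.0
print(gate.test())  # True
fake_now[0] = 122.0
print(gate.test())  # False, the per-hour limit is exhausted
```

Because all limits are checked before anything is recorded, a test that fails on one limit does not consume capacity on the others.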

Consequences

There is great rejoicing because we can now rate-limit notifications and other actions.

Alternative designs

Could use:

condition: rate_limit
tag: camera_notify
limits:
   - max: 1
     over: '0:01'
   - max: 10
     over: '1:00'
   - max: 50
     over: '24:00'
   - max: 7
     over: '0:15'

instead of having the baked-in max_per_{minute,hour,day} options. I favor folding out the common case and supporting a single custom option. We could have the common cases and the fully general list of limits, but that seems like overkill.

May 30 '20 00:05 gjbadros

I like the simple design. I think that we should just stick with per_time.

condition: rate_limit
over_time: '0:01'
tag: bla # optional

May 30 '20 21:05 balloob

You mean just a single per_time? Or "over_time" like you used in your example? With "max" as the other config option?

With short-circuiting sequential-evaluation semantics, I don't think these compose properly which is why I propose the 4 cases folded into a single rate-limiting-gate. E.g., if you did this:

condition: and
conditions:
  - condition: rate-limit
    max: 5
    over_time: '0:01'
  - condition: rate-limit
    max: 10
    over_time: '1:00'
etc.

You get the wrong behaviour: an attempt that is blocked by the hourly limit is still queued up and counted as an action against the minute limit, even though the overall condition failed.
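The composition problem can be reproduced with a small Python sketch. SingleLimit and chained are hypothetical names, and the sketch assumes each separate condition records an event whenever it passes on its own, which is what short-circuit evaluation of two independent conditions would give.

```python
from collections import deque


class SingleLimit:
    """Hypothetical one-limit condition that records an event whenever it passes."""

    def __init__(self, max_events, window_seconds):
        self.max_events = max_events
        self.window = window_seconds
        self.events = deque()  # timestamps of passes, oldest first

    def test(self, now):
        # Forget events that have left the window.
        while self.events and now - self.events[0] >= self.window:
            self.events.popleft()
        if len(self.events) >= self.max_events:
            return False
        self.events.append(now)
        return True


per_minute = SingleLimit(5, 60)
per_hour = SingleLimit(10, 3600)


def chained(now):
    # Mirrors `condition: and` with short-circuit, left-to-right evaluation.
    return per_minute.test(now) and per_hour.test(now)


# Ten attempts spaced 15 s apart all pass and fill the hourly limit.
results = [chained(t) for t in range(0, 150, 15)]
print(all(results))  # True

# The eleventh attempt is blocked by the hourly limit...
blocked = chained(150)
print(blocked)  # False

# ...but the minute limit has already counted it as if the action ran.
print(per_minute.events[-1])  # 150
```

Repeated blocked attempts therefore keep consuming the minute limit, which is the argument for folding all limits into one gate.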

May 31 '20 00:05 gjbadros

Okay, I propose just:

condition: rate_limit
tag: camera_notify
limits:
   - max: 1
     per: second
   - max: 10
     per: minute
   - max: 50
     per: day
   - max: 7
     per: '0:15'

Where per is just a time duration, given either as an HH:MM string, a number of seconds as an integer, or one of second, minute, hour, day. tag is as described before.
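As an illustration of the three accepted forms of per, here is a hedged Python sketch of a parser. The function name parse_per and the UNIT_SECONDS table are hypothetical, not part of Home Assistant, and rejecting zero or negative durations is an assumption.

```python
from datetime import timedelta

# Hypothetical table of the named periods from the proposal.
UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def parse_per(value):
    """Turn a `per` value into a timedelta.

    Accepts a positive integer number of seconds, one of the unit names
    second/minute/hour/day, or an 'HH:MM' string (hours may be empty).
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, int) and not isinstance(value, bool):
        if value > 0:
            return timedelta(seconds=value)
    elif isinstance(value, str):
        if value in UNIT_SECONDS:
            return timedelta(seconds=UNIT_SECONDS[value])
        hours, sep, minutes = value.partition(":")
        if sep and minutes.isdecimal() and (hours == "" or hours.isdecimal()):
            delta = timedelta(hours=int(hours or 0), minutes=int(minutes))
            if delta > timedelta(0):
                return delta
    raise ValueError(f"invalid rate-limit period: {value!r}")


print(parse_per("minute") == timedelta(minutes=1))   # True
print(parse_per("0:15") == timedelta(minutes=15))    # True
print(parse_per("24:00") == parse_per("day"))        # True
print(parse_per(90) == timedelta(seconds=90))        # True
```

Normalizing every form to one duration type keeps the limiter itself independent of how the period was written in YAML.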

I will document the issues with chaining them as separate conditionals, and note that they do not survive restarts (which is why I don't think supporting "month" and "year" matters).

Any final feedback before I implement?

Jun 05 '20 04:06 gjbadros

What is the value for tag?

I don't think we should mix different kinds of values for a key like is done for per in the latest suggestion. Separate time period from units.

Jun 05 '20 06:06 MartinHjelmare

Thanks for the questions, Martin!

First the easier part... the day/minute/hour are just constants meaning '24:00', ':01', '1:00', etc. While they are units, they're an abbreviation for "1 day", etc., as a means of encouraging canonical periods. Is there a standard abstraction for a time duration? The state condition's for seems a bit heavyweight for this but maybe there are others that are parsed as a dictionary value? Ideally it'd have unit names localized, too.

Re: the tag -- that's the unique key that ties different uses of condition: rate_limit together. So if you have five rules that trigger a notification of a certain logical type, all five of those rules have the final guard be a condition: rate_limit on the same unique tag.

One complexity that I thought more about overnight in contemplating the new implementation is what to do with different rate-limits at different tag locations. For example:

# script: !include scripts.yaml
script_a:
  sequence:
    - condition: rate-limit
      tag: ABClimits
      limits:
        - max: 10
          per: hour
    - service: ....
      ...

script_b:
  sequence:
    - condition: rate-limit
      tag: ABClimits
      limits:
        - max: 24
          per: day
    - service: ....
      ...

script_c:
  sequence:
    - condition: rate-limit
      tag: ABClimits
      limits:
        - max: 5
          per: hour
    - service: ....
      ...

The ABClimits tag is what makes all of these guards mutate the same event history when tracking how often the guard is passed. I had originally planned to just take the union of the limits across all the locations as the set of rate limits to enforce at each location, but that might be too subtle/confusing. That would mean, e.g., that the rate limit for all 3 scripts above is 24/day AND 5/hour (but not 10/hour for script_a, since the stricter 5/hour limit subsumes it and the implementation could optimize away the longer hourly queue).
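The union-and-prune idea can be sketched in Python as follows. The function effective_limits is hypothetical, and the pruning rule is an assumption: a limit is dropped when another limit allows no more events over a window at least as long, because sliding-window limits of that kind make the looser one redundant.

```python
from datetime import timedelta

HOUR = timedelta(hours=1)
DAY = timedelta(days=1)


def effective_limits(*limit_sets):
    """Union (max, per) limits from every use of a tag and drop redundant ones."""
    union = {limit for limits in limit_sets for limit in limits}
    kept = []
    for max_a, per_a in union:
        # (max_a, per_a) is redundant if some other limit is at least as strict
        # on the count over a window that is at least as long.
        dominated = any(
            (max_b, per_b) != (max_a, per_a) and max_b <= max_a and per_b >= per_a
            for max_b, per_b in union
        )
        if not dominated:
            kept.append((max_a, per_a))
    return sorted(kept, key=lambda limit: limit[1])


# The limits declared in script_a, script_b, and script_c above.
script_a = [(10, HOUR)]
script_b = [(24, DAY)]
script_c = [(5, HOUR)]

print(effective_limits(script_a, script_b, script_c) == [(5, HOUR), (24, DAY)])  # True
```

Note that computing this set needs every use of the tag to be known up front, which requires a pass over the whole configuration.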

Here are some options:

a) what I wrote above -- union the limits across uses of the tag

b) allow condition: rate-limit to either use use-tag: XYZ to reference a tag or tag: XYZ to define one, or neither (implicitly a hidden unique tag). Then only when defining a tag can you specify rate limits.

c) factor out a rate_limit_guard device that is a shared abstraction and has an entity_id, put the limits on that device and let condition: rate-limit reference that guard device.

(In my original rate-limit-for-notifications implementation, the rate-limit applied to an already-reusable abstraction (a notification group), which side-stepped this issue a bit.)

Thoughts?

Jun 05 '20 15:06 gjbadros

I'm leaning towards "b" as that seems simplest. Any feedback?

Jun 20 '20 18:06 gjbadros

We can't have tags know where they are used: that would mean processing all of the configuration at startup, and that's too costly.

I suggest we implement this initially without tag, see how it's used and what is missing, and go from there.

Jun 21 '20 17:06 balloob

Good point; I hadn't thought about the startup costs.

I'll do without the tag, and instead add an arbitrary condition: guard to notify group -- that, then, can be the grouping abstraction over the places where the notifications being sent need to share the limits. It's separately a nice idea, IMO, because it makes it easy to control a notification group with other conditions (e.g., only send to this device when I'm home, or whatever).

I like this plan... I'll hold for a couple days in case there is other feedback or feedback on my related improvement to notify_group. (And because I have other things I'm busy with :-) .)

Jun 21 '20 23:06 gjbadros

Would love this condition for my Automations!

Any guidance on achieving similar rate limits on Automations with the current version of Home Assistant?

Jun 05 '22 09:06 Deagan

Also definitely looking forward to any sort of rate limiting being implemented! super useful to say the least :)

Sep 14 '22 21:09 clowrey

This architecture issue is old, stale, and possibly obsolete. Things changed a lot over the years. Additionally, we have been moving to discussions for these architectural discussions.

For that reason, I'm going to close this issue.

../Frenck

May 11 '23 14:05 frenck
