vector icon indicating copy to clipboard operation
vector copied to clipboard

Consistently rate limit internal logs

Open jszwedko opened this issue 1 year ago • 3 comments

Follow-up to https://github.com/vectordotdev/vector/issues/13185

While implementing https://github.com/vectordotdev/vector/issues/13185 it was clear there is still some ambiguities about how we do rate limiting of internal logs: which levels should have which rate limits.

Current state:

InternalEvent:

  • error - spec calls for default 10s rate limit
  • warn - undefined limit(?)
  • info - undefined limit(?)
  • debug - never rate limit
  • trace - never rate limit

log():

  • user configured, regardless of level defaults to 1

We think this would be better as:

InternalEvent:

  • Rate limit all events to 10 seconds, regardless of level, overridable by new option in https://github.com/vectordotdev/vector/issues/13185

log():

  • Default to 1 for compatibility
  • If new CLI log rate limit option is set, use this as the default instead of 1
  • Allow overriding via rate_limit_secs for compatibility

Open questions:

  • Should the rate limit be disabled if log verbosity is DEBUG or higher?
  • Should we deprecate rate_limit_secs for VRL's log and just use the configured log rate limit (defaulting to 10)

Suggested behavior of the log() function:

  • log("foo"), RATE_LIMIT= => that log rate limited at 1s
  • log("foo"), RATE_LIMIT=5 => that log rate limited at 5s
  • log("foo"), rate_limit_secs: 0), RATE_LIMIT=5 => that log is not rate limited (0 being no limit)

jszwedko avatar Sep 14 '22 19:09 jszwedko

  • Should the rate limit be disabled if log verbosity is DEBUG or higher?

If it's globally applied, would it be easier to just set the LOG_LIMIT_SEC=0?

  • Should we deprecate rate_limit_secs for VRL's log and just use the configured log rate limit (defaulting to 10)

The log() function is regularly used to expose issues with parsing in remap transforms. I feel like having that separately controlled vs Vector's events is valuable.

spencergilbert avatar Sep 14 '22 19:09 spencergilbert

  • Should the rate limit be disabled if log verbosity is DEBUG or higher?

If it's globally applied, would it be easier to just set the LOG_LIMIT_SEC=0?

That's my leaning too: that we just encourage people to disable the rate limit, in addition to increasing verbosity, when they want to see more logs.

  • Should we deprecate rate_limit_secs for VRL's log and just use the configured log rate limit (defaulting to 10)

The log() function is regularly used to expose issues with parsing in remap transforms. I feel like having that separately controlled vs Vector's events is valuable.

Yeah, that's a good point and argument for keeping the parameter 👍

jszwedko avatar Sep 14 '22 20:09 jszwedko

Realistically I don't think we need to keep log()'s default as 1 for compatibility, it shouldn't be a big deal changing that - but it's also not really painful to keep it as-is.

spencergilbert avatar Sep 14 '22 21:09 spencergilbert

Currently our InternalEvents do default to 10 when the rate limit flag is set to true at callsite. Regarding the log function:

If new CLI log rate limit option is set, use this as the default instead of 1

One way of implementing this is handling it within the tracing-limit crate, but that will increase code coupling.

arshiyasolei avatar Sep 22 '22 15:09 arshiyasolei