util: log deduplicator

Open ortuman opened this issue 2 years ago • 1 comments

Signed-off-by: Miguel Ángel Ortuño [email protected]

What this PR does

This PR adds a new deduplicating logger to reduce pressure on the logging pipeline caused by highly redundant log types.

Notes for reviewers

We know logging is critical, therefore the following considerations have been taken into account during the implementation:

Aiming for minimal memory allocation

The implementation uses a static table of entries to track deduplication state. The table is allocated once during initialization and is used instead of a map, so these states are never allocated or deallocated at runtime. Each table slot is either active or inactive at any given time. To find an entry, we scan the whole table and compare each slot against the entry's dedup key. The scan is linear in the table size, but because that size is small and fixed and the table has good CPU cache locality, its cost is treated as effectively constant.
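A minimal sketch of the fixed-size table and its linear scan, for illustration only. The names `tableSize`, `slot`, `table`, `find` and `observe` are hypothetical and not taken from the PR, the slot count is an arbitrary choice, and returning 0 when the table is full is an assumption (the description does not say what happens in that case).

```go
package main

// tableSize is a hypothetical slot count; the PR does not state the real value.
const tableSize = 8

// slot is one pre-allocated table entry (field names are assumptions).
type slot struct {
	active bool
	key    string
	count  int
}

// table is allocated once as a fixed-size array; it never grows or shrinks.
type table struct {
	slots [tableSize]slot
}

// find scans every slot and returns the index of the active slot
// holding key, or -1 if there is none.
func (t *table) find(key string) int {
	for i := range t.slots {
		if t.slots[i].active && t.slots[i].key == key {
			return i
		}
	}
	return -1
}

// observe increments the counter for key, claiming a free slot the first
// time the key is seen. It returns the updated count, or 0 if the table
// is full (assumed behavior: the caller would forward the log as-is).
func (t *table) observe(key string) int {
	if i := t.find(key); i >= 0 {
		t.slots[i].count++
		return t.slots[i].count
	}
	for i := range t.slots {
		if !t.slots[i].active {
			t.slots[i] = slot{active: true, key: key, count: 1}
			return 1
		}
	}
	return 0
}
```

Because `slots` is an array rather than a map or slice, registering and releasing entries only flips fields in place and does not allocate.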

Aiming for minimal lock contention

Because the deduplication table is static (it never grows or shrinks), each table slot can have its own mutex, which reduces contention when multiple goroutines log concurrently.

Brief explanation of how it works

Whenever a new log type is found, we register it in the entries table under its dedup key with a dedup counter of 1. When more logs of the same type come in, the dedup counter is incremented and the last-seen timestamp is updated. The log entry is eventually forwarded downstream when any of the following conditions is met:

  • The maximum deduplication count is reached
  • A certain period of inactivity has passed (the entry has not been seen)
  • The deduplicator is stopped

When forwarding, if the dedup counter is greater than 1, an additional dedup label will be attached as part of the log entry.

Which issue(s) this PR fixes or relates to

Fixes #1900

Checklist

  • [x] Tests updated
  • [ ] Documentation added
  • [x] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

ortuman avatar Jun 07 '22 14:06 ortuman

The CHANGELOG has just been cut to prepare for the next Mimir release. Please rebase on main and, if needed, move the CHANGELOG entry added or updated in this PR to the top of the CHANGELOG document. Thanks!

pracucci avatar Oct 07 '22 09:10 pracucci