Split collection limit out of cardinality limit
As discussed in the last specification SIG meeting (2024-01-09), the existing cardinality limit is being used to represent two different limits:
1. The maximum number of measurements made for distinct attribute sets within a collection cycle
2. The maximum number of time-series exported at the end of the collection cycle
This distinction is meaningful for a few reasons.
- Users may not want these to be the same value. One applies to the system on which telemetry is being produced, and the other applies to the downstream telemetry transmission and storage systems.
- The produced telemetry will differ based on how this limit is applied in relation to any attribute filtering.
- A user who tries to remediate an exceeded limit (1) with an attribute filter may be unable to do so if the implementation applies the filter at the end of the collection cycle.
Proposal
- Introduce a new "collection limit" that directly sets the maximum number of measurements allowed for distinct attribute sets within a collection cycle (1)
- Include recommendations for implementations to document that users should resolve "collection limit" scenarios using the instrument attribute advisory parameter
- Refine the definition of "cardinality limit" to only be the maximum number of time-series exported at the end of the collection cycle (2)
- Include recommendations for implementations to document that users should resolve "cardinality limit" scenarios using the instrument attribute advisory parameter or with an attribute filter on a view.
cc @trask @jmacd @jack-berg @jsuereth
I think I agree with this proposal. The Lightstep metrics SDK which I used for prototyping does have two limits that can be roughly described as @MrAlias has described above.
We might disagree on what "within a collection cycle" means. In my implementation, this "interior" cardinality limit is enforced between any two collection cycles by any two Readers. So -- and I admit this is not very intuitive -- when the interior limit is being reached, one way to address this is for the user to add another Reader with a shorter collection cycle. This will push the cardinality out of the interior data structure into each reader sooner, at which point the per-reader collection limit is well defined.
@utpilla looking for input.