cadence icon indicating copy to clipboard operation
cadence copied to clipboard

Add support for sampling metrics

Open ianks opened this issue 1 year ago • 6 comments

This PR adds some initial support for sampling individual metrics. I made it an opt-in feature in Cargo so folks won't pull in the rand dependency by default.

Since I'm new to the codebase, I probably did some things sub-optimally. Let me know I can improve it!

resolves #179

ianks avatar May 04 '23 03:05 ianks

Thanks! I'll try to take a first pass over this in a bit.

56quarters avatar May 04 '23 13:05 56quarters

I messed around with a slight refactor of this PR which tries a slightly different approach by moving the sampling behavior to a new type. LMK which you prefer:

diff --git i/cadence/src/types.rs w/cadence/src/types.rs
index 3ececec..ee4c227 100644
--- i/cadence/src/types.rs
+++ w/cadence/src/types.rs
@@ -8,6 +8,7 @@
 // option. This file may not be copied, modified, or distributed
 // except according to those terms.
 
+use crate::builder::SampleRate;
 use crate::builder::{MetricFormatter, MetricValue};
 use std::error;
 use std::fmt;
@@ -19,6 +20,76 @@ use std::io;
 /// types of metrics as defined in the [Statsd spec](https://github.com/b/statsd_spec).
 pub trait Metric {
     fn as_metric_str(&self) -> &str;
+
+    fn is_sampled(&self) -> bool {
+        true
+    }
+}
+
+/// Marker trait for metrics that can be sampled.
+///
+/// > A float between 0 and 1, inclusive. Only works with COUNT, HISTOGRAM,
+/// > DISTRIBUTION, and TIMER metrics. The default is 1, which samples 100% of the
+/// > time.
+/// > - via [DataDog](https://docs.datadoghq.com/developers/dogstatsd/datagram_shell)
+pub trait Sampleable: Metric {
+    /// Returns a new metric that indicates this metric was sampled.
+    fn sampled(self, sample_rate: f32) -> Result<Sampled<Self>, MetricError>
+    where
+        Self: Sized,
+    {
+        Sampled::new(self, sample_rate)
+    }
+}
+
+impl Sampleable for Counter {}
+impl Sampleable for Timer {}
+impl Sampleable for Gauge {}
+impl Sampleable for Histogram {}
+
+/// Wraps a that indicates a metric was sampled.
+pub struct Sampled<T: Sampleable> {
+    is_sampled: bool,
+    repr: String,
+    _marker: std::marker::PhantomData<T>,
+}
+
+impl<T: Sampleable> Sampled<T> {
+    pub fn new<S: TryInto<SampleRate>>(
+        metric: T,
+        sample_rate: S,
+    ) -> Result<Self, <S as std::convert::TryInto<SampleRate>>::Error> {
+        let sample_rate = sample_rate.try_into()?;
+        let mut repr = String::with_capacity(metric.as_metric_str().len() + sample_rate.kv_size());
+        repr.push_str(metric.as_metric_str());
+        repr.push('|');
+        repr.push_str(sample_rate.as_str());
+
+        #[cfg(feature = "sample-rate")]
+        use rand::Rng;
+
+        #[cfg(feature = "sample-rate")]
+        let is_sampled = rand::thread_rng().gen_bool(sample_rate.value() as f64);
+
+        #[cfg(not(feature = "sample-rate"))]
+        let is_sampled = true;
+
+        Ok(Sampled {
+            repr,
+            is_sampled,
+            _marker: std::marker::PhantomData,
+        })
+    }
+}
+
+impl<T: Sampleable> Metric for Sampled<T> {
+    fn as_metric_str(&self) -> &str {
+        &self.repr
+    }
+
+    fn is_sampled(&self) -> bool {
+        self.is_sampled
+    }
 }
 
 /// Counters are simple values incremented or decremented by a client.
@@ -307,7 +378,7 @@ pub type MetricResult<T> = Result<T, MetricError>;
 mod tests {
     #![allow(deprecated, deprecated_in_future)]
 
-    use super::{Counter, ErrorKind, Gauge, Histogram, Meter, Metric, MetricError, Set, Timer};
+    use super::{Counter, ErrorKind, Gauge, Histogram, Meter, Metric, MetricError, Sampleable, Set, Timer};
     use std::error::Error;
     use std::io;
 
@@ -421,4 +492,17 @@ mod tests {
         let our_err = MetricError::from((ErrorKind::InvalidInput, "Nope!"));
         assert!(our_err.source().is_none());
     }
+
+    #[test]
+    fn test_metrics_can_be_wrapped_as_sampled() {
+        let counter = Counter::new("my.app.", "test.counter", 4).sampled(1.0 / 3.0).unwrap();
+        let gauge = Gauge::new("my.app.", "test.gauge", 2).sampled(0.5).unwrap();
+        let histogram = Histogram::new("my.app.", "test.histogram", 45).sampled(0.5).unwrap();
+        let timer = Timer::new("my.app.", "test.timer", 34).sampled(0.5).unwrap();
+
+        assert_eq!("my.app.test.counter:4|c|@0.33333", counter.as_metric_str());
+        assert_eq!("my.app.test.timer:34|ms|@0.5", timer.as_metric_str());
+        assert_eq!("my.app.test.gauge:2|g|@0.5", gauge.as_metric_str());
+        assert_eq!("my.app.test.histogram:45|h|@0.5", histogram.as_metric_str());
+    }
 }

ianks avatar May 04 '23 14:05 ianks

Before I take a more thorough look, I've come around to unconditionally including the dependency on rand since it would allow us to get rid of all the conditional compilation. WDYT?

56quarters avatar May 04 '23 15:05 56quarters

100% agree. Most folks will have it in their dep tree already, anyway I imagine.

ianks avatar May 04 '23 16:05 ianks

Unrelated to your changes, but make sure you're running tests and linting locally since it seems like I haven't correctly set CI to run for forks.

56quarters avatar May 05 '23 16:05 56quarters

I messed around with a slight refactor of this PR which tries a slightly different approach by moving the sampling behavior to a new type. LMK which you prefer:

My preference would be for the original implementation, adding a .with_sample_rate() method to MetricBuilder. It feels more natural to me that you can only add sampling when you're "building" a metric to send, rather than applying a sample after a Metric (Counter, Gauge, etc) has already been constructed.

56quarters avatar May 05 '23 21:05 56quarters