datasketches-cpp

Study to compare t-Digest and REQ sketch

Open AlexanderSaydakov opened this issue 1 year ago • 13 comments

Compare the performance of t-Digest with its closest competitor in the library, REQ sketch. REQ sketch is the closest competitor because it prioritizes accuracy at either high ranks (HRA mode) or low ranks (LRA mode), unlike the other quantile sketches (KLL, classic), which have the same rank error at every rank. There are a few obvious differences:

  • REQ sketch can work with any data type that has a comparator; t-Digest is limited to numeric data (floating-point types).
  • REQ sketch retains and returns only values observed in the input: it has no notion of distance (only a less-than comparison) and does no interpolation. t-Digest is based on computing means and does interpolate.
  • t-Digest prioritizes accuracy at both high and low ranks at the same time with the default scaling function. Perhaps this can be changed with a different scaling function.
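
To illustrate the last point, here is a minimal sketch of how a symmetric scaling function favors both tails at once. It uses the k_1 scale function from the t-digest paper, k(q) = δ/(2π) · asin(2q − 1), purely as an example; it is an assumption that this resembles the library's default, which may well be a different function. The helper names `k1` and `max_cluster_fraction` are hypothetical and not part of datasketches-cpp.

```cpp
#include <cassert>
#include <cmath>

// k_1 scale function from the t-digest paper (illustrative choice only,
// not necessarily the function the library uses by default):
//   k(q) = delta / (2 * pi) * asin(2q - 1)
// q is a normalized rank in [0, 1]; delta is the compression parameter.
double k1(double q, double delta) {
  assert(q >= 0.0 && q <= 1.0);
  const double pi = std::acos(-1.0);
  return delta / (2.0 * pi) * std::asin(2.0 * q - 1.0);
}

// Approximate fraction of the input that a single centroid near rank q
// may cover, taken as 1 / (dk/dq) = 2 * pi * sqrt(q * (1 - q)) / delta.
// This is a local approximation; it is symmetric in q and 1 - q, so
// centroids shrink toward both rank 0 and rank 1.
double max_cluster_fraction(double q, double delta) {
  assert(q >= 0.0 && q <= 1.0);
  const double pi = std::acos(-1.0);
  return 2.0 * pi * std::sqrt(q * (1.0 - q)) / delta;
}
```

With δ = 100, this approximation allows a centroid near the median to cover about 3.1% of the input, but only about 0.2% near q = 0.001 or q = 0.999. REQ sketch, by contrast, tightens the error at one end only, depending on HRA or LRA mode.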

AlexanderSaydakov, Jan 05 '24 20:01