how do I efficiently query for unique values of a field

Open ami-m opened this issue 3 years ago • 2 comments

Say I get a stream of data: {machineCode: "", lat: , lon: }, and I want to display a count of such data points per machineCode.

Is there a way to efficiently get all the unique machine codes? Or should I just keep track of them while inserting data?

ami-m avatar Dec 09 '22 21:12 ami-m

There's no built-in feature in column for this, but there are two ways I can think of to solve this problem:

  1. if you're okay with an approximate result, use HyperLogLog over the machine codes (it estimates how many distinct codes there are, but cannot list them)
  2. otherwise, a standard map/set is required

You can do either one during insertion, or with a range query that iterates over all elements.
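
For the map/set option, a minimal sketch might look like the following. It deliberately does not use the column API: `CodeCounter`, `Observe`, `Codes` and `Rebuild` are hypothetical names made up for illustration, and the `scan` argument stands in for whatever full iteration (for example a range query) the store provides. It is not safe for concurrent use as written.

```go
package main

import "sort"

// CodeCounter is a hypothetical helper (not part of column) that tracks
// how many records have been seen for each machine code.
type CodeCounter struct {
	counts map[string]int
}

// NewCodeCounter returns an empty counter.
func NewCodeCounter() *CodeCounter {
	return &CodeCounter{counts: make(map[string]int)}
}

// Observe records one data point. Call it alongside every insert.
func (c *CodeCounter) Observe(machineCode string) {
	c.counts[machineCode]++
}

// Count returns the number of records seen for a machine code.
func (c *CodeCounter) Count(machineCode string) int {
	return c.counts[machineCode]
}

// Codes returns the unique machine codes in sorted order.
func (c *CodeCounter) Codes() []string {
	codes := make([]string, 0, len(c.counts))
	for code := range c.counts {
		codes = append(codes, code)
	}
	sort.Strings(codes)
	return codes
}

// Rebuild discards the current counts and refills them from a full scan.
// The scan function is an assumption: it should call visit once per
// stored record, passing that record's machine code.
func (c *CodeCounter) Rebuild(scan func(visit func(machineCode string))) {
	c.counts = make(map[string]int)
	scan(c.Observe)
}
```

Calling `Observe` on every insert keeps the counts current at the cost of one map update per record. `Rebuild` covers the case where the map has been lost and has to be recomputed from a pass over all elements.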

kelindar avatar Dec 11 '22 14:12 kelindar

Thanks, I went with the second method, but that leaves me having to do the range query when restoring state from a snapshot :-(

ami-m avatar Dec 12 '22 12:12 ami-m