featran

Add NHeavyHitters transformer

Open martinbomio opened this issue 7 years ago • 3 comments

It could be useful to have a transformer that applies global heavy hitters to a Seq-like attribute.

Something like NHeavyHitters extends Transformer[Seq[A], SketchMap[String, Long], Map[String, (Int, Long)]]

martinbomio avatar May 25 '18 12:05 martinbomio

@richwhitjr WDYT?

nevillelyh avatar May 25 '18 12:05 nevillelyh

We already have something close to this. Is the idea that, instead of being limited to a single String, we could pass a list of Strings, etc.? How would you like to use the output? Should the SketchMap be used to map the heavy hitters back into a Seq[(String, (Int, Long))] and remove those that don't make the top N?

richwhitjr avatar May 25 '18 17:05 richwhitjr

@richwhitjr yeah, that's what I was thinking. Right now, for my use case, I would like to get a List[Int] representing the indices in the top N for each of the attributes in the input sequence, filtering out those that are not in the top N.

martinbomio avatar May 29 '18 09:05 martinbomio
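
For illustration, the behavior requested in the last comment could be sketched in plain Scala as below. This is only a sketch under stated assumptions: `TopNIndices`, `fit`, and `transform` are hypothetical names, not part of featran, and exact counting in a `Map` stands in for the approximate `SketchMap` mentioned above. Tie-breaking by item name is also an assumption.

```scala
// Hypothetical helper, not part of featran.
// Exact counts stand in for an approximate SketchMap.
object TopNIndices {

  // Count every item across all rows and keep the n most frequent ones,
  // mapping each kept item to (rank, count). Ties are broken by item name.
  def fit(rows: Seq[Seq[String]], n: Int): Map[String, (Int, Long)] = {
    val counts: Map[String, Long] =
      rows.flatten.groupBy(identity).map { case (k, v) => (k, v.size.toLong) }

    counts.toSeq
      .sortBy { case (item, count) => (-count, item) }
      .take(n)
      .zipWithIndex
      .map { case ((item, count), rank) => (item, (rank, count)) }
      .toMap
  }

  // Replace each item in a row with its rank in the top n,
  // dropping items that did not make the top n.
  def transform(row: Seq[String], topN: Map[String, (Int, Long)]): List[Int] =
    row.flatMap(item => topN.get(item).map(_._1)).toList
}
```

With the rows Seq("a", "b", "a"), Seq("a", "c"), Seq("b", "d") and n = 2, `fit` keeps "a" (rank 0, count 3) and "b" (rank 1, count 2), so `transform(Seq("a", "c", "b"), topN)` yields List(0, 1).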