hive-udf

Approximate cardinality estimation with HyperLogLog, as a Hive function

An implementation of the HyperLogLog approximate cardinality estimation algorithm (and of Linear Counting) as a Hive user-defined aggregate function (UDAF).

Relies on stream-lib for the implementation of both algorithms.
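For readers unfamiliar with the algorithms, the following is a minimal, self-contained Python sketch of HyperLogLog that falls back to Linear Counting for small cardinalities. It is an illustration only: it is not the stream-lib code and not this project's UDAF. The class name `HyperLogLog`, the helper `_hash64`, the default precision `p=12`, and the choice of SHA-1 as the hash are all assumptions made for the example.

```python
import hashlib
import math


def _hash64(value: str) -> int:
    """Hypothetical helper: 64-bit hash taken from the first 8 bytes of SHA-1."""
    return int.from_bytes(hashlib.sha1(value.encode("utf-8")).digest()[:8], "big")


class HyperLogLog:
    """Illustrative HyperLogLog sketch with 2**p registers (assumes 7 <= p <= 16)."""

    def __init__(self, p: int = 12):
        if not 7 <= p <= 16:
            raise ValueError("p must be between 7 and 16")
        self.p = p
        self.m = 1 << p                  # number of registers
        self.registers = [0] * self.m

    def add(self, value: str) -> None:
        h = _hash64(value)
        width = 64 - self.p
        idx = h >> width                 # top p bits select a register
        rest = h & ((1 << width) - 1)    # remaining bits feed the rank
        # Rank = 1-based position of the leftmost 1 bit in the remaining bits.
        rank = width - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other: "HyperLogLog") -> None:
        """Combine two sketches of equal precision by taking register-wise maxima."""
        if other.p != self.p:
            raise ValueError("precision mismatch")
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def cardinality(self) -> float:
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)     # bias constant, valid for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros > 0:
            # Small range: use the Linear Counting estimate on empty registers.
            return m * math.log(m / zeros)
        return raw
```

Because two sketches merge by taking the maximum of each register, partial results computed on separate chunks of data can be combined without rescanning the input, which is the property that makes this kind of sketch a natural fit for an aggregate function. With `p=12` the expected relative error is roughly 1.04 / sqrt(4096), or about 1.6%.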

See the Wiki for usage instructions.