stream-lib
Possibly out of range
I'm using the com.clearspring.analytics.stream.quantile.QDigest class to approximate quantiles over about 100k data points, and summing them may produce a result larger than the int64 range. I found this when running on Amazon EMR:
Caused by: java.lang.IllegalArgumentException: Can only accept values in the range 0..4611686018427387903, got 9223372036854775807
at com.clearspring.analytics.stream.quantile.QDigest.offer(QDigest.java:125)
at com.liveramp.cascading_ext.combiner.lib.QuantileExactAggregator.partialAggregate(QuantileExactAggregator.java:38)
at com.liveramp.cascading_ext.combiner.lib.QuantileExactAggregator.partialAggregate(QuantileExactAggregator.java:17)
at com.liveramp.cascading_ext.combiner.CombinerFunctionContext.combineAndEvict(CombinerFunctionContext.java:130)
at com.liveramp.cascading_ext.combiner.CombinerFunction.operate(CombinerFunction.java:130)
at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:99)
... 11 more
I suppose this is because the offer method's parameter is defined as long. Is there any workaround for this?
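One possible stopgap, sketched here under assumptions rather than taken from stream-lib itself, is to check each value against the range quoted in the exception message before passing it to `offer`. The class `QDigestRangeGuard`, the constant `MAX_OFFERABLE`, and the helper `offerIfInRange` are hypothetical names; the bound is copied from the message above (it happens to equal `Long.MAX_VALUE / 2`). A `LongConsumer` stands in for the digest so the sketch runs without the library; in real code it could be something like `digest::offer`.

```java
import java.util.function.LongConsumer;

class QDigestRangeGuard {
    // Upper bound copied from the exception message; equals Long.MAX_VALUE / 2.
    static final long MAX_OFFERABLE = 4611686018427387903L;

    // True if the value lies in the range the exception message reports as accepted.
    static boolean inRange(long value) {
        return value >= 0 && value <= MAX_OFFERABLE;
    }

    // Hypothetical helper: forwards in-range values to the sink and skips the rest.
    // Returns whether the value was forwarded, so callers can count rejections.
    static boolean offerIfInRange(long value, LongConsumer sink) {
        if (!inRange(value)) {
            return false;
        }
        sink.accept(value);
        return true;
    }

    public static void main(String[] args) {
        long[] accepted = {0};
        LongConsumer sink = v -> accepted[0]++; // stand-in for a digest's offer method

        long[] samples = {42L, Long.MAX_VALUE, -1L};
        int rejected = 0;
        for (long s : samples) {
            if (!offerIfInRange(s, sink)) {
                rejected++;
            }
        }
        System.out.println("accepted=" + accepted[0] + " rejected=" + rejected);
    }
}
```

Skipping out-of-range values changes what the quantiles describe, so whether to skip, clamp, or rescale them depends on what a value like 9223372036854775807 means in the data (it may be a sentinel rather than a real measurement).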
Q-digest will cost more on every access if it uses long internally.
Handling high-resolution inputs is something that t-digest specifically excels at. I think some version of t-digest is included in stream-lib. Recent versions are very fast and beat Q-digest accuracy dramatically, especially for high-resolution inputs, for dramatic skew, and for tail quantiles (which is what almost everybody wants).
(In reply to Ahmad Priatama's message of Thu, Apr 16, 2015 at 4:35 AM; issue: https://github.com/addthis/stream-lib/issues/90)