druid
druid copied to clipboard
KLL sketch
This is a module to support new KllFloatsSketch and KllDoublesSketch from Apache DataSketches
Very cool! Thanks @AlexanderSaydakov!
Has this code been used in a production environment? Any gotches we should know about?
There appears to be a legitimate issue in KllDoublesSketchAggregatorTest.buildingSketchesAtQueryTime
when SQL-compatible null handling is enabled: https://app.travis-ci.com/github/apache/druid/jobs/569163704
[ERROR] org.apache.druid.query.aggregation.datasketches.kll.KllFloatsSketchAggregatorTest.buildingSketchesAtQueryTime[groupByConfig = v2, vectorize = true]
[ERROR] Run 1: KllFloatsSketchAggregatorTest.buildingSketchesAtQueryTime:367 expected:<377> but was:<355>
[ERROR] Run 2: KllFloatsSketchAggregatorTest.buildingSketchesAtQueryTime:367 expected:<377> but was:<355>
[ERROR] Run 3: KllFloatsSketchAggregatorTest.buildingSketchesAtQueryTime:367 expected:<377> but was:<355>
[ERROR] Run 4: KllFloatsSketchAggregatorTest.buildingSketchesAtQueryTime:367 expected:<377> but was:<355>
This code is included in our datasketches-java 3.2.0 release dated 27 Apr 2022. There are no gotches we know about, yet :)
@gianm
Has this code been used in a production environment? Any gotches we should know about?
No, this code has not been used yet. I am not aware of any gotchas. I believe I fixed that error in the test you pointed out.
FYI I haven't had a chance to take a look at this but it is in my queue. If anyone gets to review it before I do then thanks!
Two test runs failed with the below segfault in KllDoublesSketchAggregatorTest
(first time) and KllFloatsSketchAggregatorTest
(second time).
Judging by the fact it's BaseWritableMemoryImpl.getNativeOrderedLong
, it could be an out-of-bound access or a use-after-free. I tried running the test case in a loop in my IDE, and unfortunately wasn't able to repro this.
In looking at the code, I suspect it's due to a use-after-free due to returning an object from KllSketchBuildBufferAggregatorHelper.get
that references the underlying buffer. That's used to implement BufferAggregator.get
and VectorAggregator.get
, which by contract aren't permitted to return objects that reference the underlying buffer. (It may be freed, or reused, while those get'ed objects still exist.)
Similar code in other sketch helpers creates on-heap copies (hll uses copy()
, quantiles uses compact()
). Could we do the same thing here?
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f69d737e924, pid=21102, tid=21103
#
# JRE version: OpenJDK Runtime Environment (11.0.2+9) (build 11.0.2+9)
# Java VM: OpenJDK 64-Bit Server VM (11.0.2+9, mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# J 14439 c1 org.apache.datasketches.memory.internal.BaseWritableMemoryImpl.getNativeOrderedLong(J)J (38 bytes) @ 0x00007f69d737e924 [0x00007f69d737e7a0+0x0000000000000184]
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /home/travis/build/apache/druid/extensions-core/datasketches/hs_err_pid21102.log
Could not load hsdis-amd64.so; library not loadable; PrintAssembly is disabled
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#
So close! There's a conflict in ColumnType, probably from https://github.com/apache/druid/pull/12914, which was a response to the discussion in this PR about better supporting the arrays returned by the KLL postaggs. I took the liberty of resolving the conflict using GitHub's "resolve conflict" feature.
Hmm. Another crash.
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f193747fdac, pid=21257, tid=0x00007f194d6bd700
#
# JRE version: OpenJDK Runtime Environment (8.0_292-b10) (build 1.8.0_292-8u292-b10-0ubuntu1~16.04.1-b10)
# Java VM: OpenJDK 64-Bit Server VM (25.292-b10 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# J 14078 C1 org.apache.datasketches.memory.internal.BaseWritableMemoryImpl.getNativeOrderedLong(J)J (38 bytes) @ 0x00007f193747fdac [0x00007f193747fc20+0x18c]
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /home/travis/build/apache/druid/extensions-core/datasketches/hs_err_pid21257.log
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#
I suspect the additional crash was due to a similar problem, but for unions. I pushed https://github.com/apache/druid/pull/12498/commits/7e117c72c86373027b39b701dd965d164716af52 to the branch which will copy those too. @AlexanderSaydakov if there's a better way to do this, please let me know (or just change it).
I suspect the additional crash was due to a similar problem, but for unions. I pushed 7e117c7 to the branch which will copy those too. @AlexanderSaydakov if there's a better way to do this, please let me know (or just change it).
Looks fine to me. Sorry, I missed that part.
Looks like the latest build passed tests. I'll merge it. Thanks @AlexanderSaydakov !