Examine Hot Methods
I analyzed how well the JIT compiler optimizes Accumulo code by running tests locally in JMH and against a single local instance of Uno. To print what the JIT compiler was doing, I used the following Java options:
-XX:+PrintCompilation
-XX:+UnlockDiagnosticVMOptions
-XX:+PrintInlining
Then I would grep the output for "accumulo" and "hot method too big". Here is the list of methods I collected from the tests I ran on both the client and the server:
org.apache.accumulo.core.client.impl.TabletLocatorImpl::processInvalidated
org.apache.accumulo.core.client.impl.ThriftScanner::scan
org.apache.accumulo.core.data.Key::equals
org.apache.accumulo.core.data.thrift.TMutation$TMutationStandardScheme::read
org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader::_seek
org.apache.accumulo.core.file.rfile.RelativeKey::<init>
org.apache.accumulo.core.file.rfile.RelativeKey::readFields
org.apache.accumulo.core.security.ColumnVisibility$ColumnVisibilityParser::parse_
org.apache.accumulo.fate.zookeeper.ZooCache$2::run
org.apache.accumulo.server.constraints.MetadataConstraints::check
org.apache.accumulo.server.master.LiveTServerSet::checkServer
org.apache.accumulo.tserver.FileManager::reserveReaders
org.apache.accumulo.tserver.constraints.ConstraintChecker::check
org.apache.accumulo.tserver.scan.NextBatchTask::run
org.apache.accumulo.tserver.tablet.ScanDataSource::createIterator
org.apache.accumulo.tserver.tablet.Scanner::read
Taken from https://issues.apache.org/jira/browse/ACCUMULO-4621
I originally did this for 2.0 on Java 8. It would be good to examine these methods on Java 11 to see if they are affecting performance.
What was run as part of the benchmark? I wonder if something like this would be useful if included in the accumulo-testing repo (or somewhere else).
I don't think I did much. I think I just ingested some data. You could just run CI for a bit and get some good data.
I just re-ran this using cingest ingest, cingest verify, and cingest scan:
grep accumulo *.out | grep "hot method too big" | awk '{ print $(NF-6) }' | sort | uniq -c
1 org.apache.accumulo.core.clientImpl.TabletServerBatchWriter$MutationWriter::sendMutationsToTabletServer
4 org.apache.accumulo.core.dataImpl.thrift.TMutation$TMutationStandardScheme::read
29 org.apache.accumulo.core.data.Key::equals
10 org.apache.accumulo.core.file.rfile.bcfile.BCFile$Reader::<init>
22 org.apache.accumulo.core.file.rfile.bcfile.Utils::writeVLong
8 org.apache.accumulo.core.file.rfile.RelativeKey::fastSkip
5 org.apache.accumulo.core.file.rfile.RelativeKey::<init>
14 org.apache.accumulo.core.file.rfile.RelativeKey::readFields
4 org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader::_seek
1 org.apache.accumulo.core.file.rfile.RFile$Reader::<init>
1 org.apache.accumulo.core.iteratorsImpl.system.LocalityGroupIterator::_seek
12 org.apache.accumulo.core.iteratorsImpl.system.SourceSwitchingIterator::readNext
1 org.apache.accumulo.core.spi.compaction.DefaultCompactionPlanner::makePlan
1 org.apache.accumulo.server.constraints.MetadataConstraints::check
2 org.apache.accumulo.server.fs.FileManager::reserveReaders
3 org.apache.accumulo.server.fs.FileManager$ScanFileManager::openFiles
1 org.apache.accumulo.tserver.compactions.CompactionService::submitCompactionJob
1 org.apache.accumulo.tserver.scan.LookupTask::run
2 org.apache.accumulo.tserver.scan.NextBatchTask::run
1 org.apache.accumulo.tserver.ScanServer::reserveFiles
1 org.apache.accumulo.tserver.tablet.CompactableImpl$FileManager::getCandidates
2 org.apache.accumulo.tserver.tablet.ScanDataSource::createIterator
2 org.apache.accumulo.tserver.tablet.Scanner::read
4 org.apache.accumulo.tserver.tablet.TabletBase::nextBatch
1 org.apache.accumulo.tserver.tablet.Tablet::findSplitRow
3 org.apache.accumulo.tserver.ThriftScanClientHandler::continueScan
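The pipeline above can be wrapped in a small function so that repeated runs are easier to compare. This is only a sketch: the name count_hot_methods is hypothetical, and it assumes each PrintInlining line ends with "(N bytes)   hot method too big", which is what makes $(NF-6) land on the method name. It also sorts the result by count, which the original command does not do.

```shell
# Hypothetical helper (not part of Accumulo or accumulo-testing).
# Usage: count_hot_methods FILE...
# Prints "count method" pairs, most frequent first.
count_hot_methods() {
  # -h drops the "file:" prefix grep adds when given several files.
  # awk counts fields from the end of the line, so the method name is
  # found regardless of how much indentation precedes the "@" marker.
  grep -h accumulo "$@" \
    | grep "hot method too big" \
    | awk '{ print $(NF-6) }' \
    | sort | uniq -c | sort -rn
}

# Example (run in the log files directory):
#   count_hot_methods *.out
```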
@milleruntime @dlmarion, do either of you have a link to a branch/repo/gist that you used to produce this output? I could try looking into some of these, but I'm not exactly sure how to verify my changes fixed anything without first being able to reproduce the report.
I just:
1. created a small cluster from the main branch
2. added the following to JAVA_OPTS in accumulo-env.sh
-XX:+PrintCompilation
-XX:+UnlockDiagnosticVMOptions
-XX:+PrintInlining
3. ran cingest commands
4. ran the following in the log files directory
grep accumulo *.out | grep "hot method too big" | awk '{ print $(NF-6) }' | sort | uniq -c
Also, some, or all, of these methods might not meet the criteria for inlining unless they are refactored. If you are not familiar with it, here is a good explanation.
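One of the inlining criteria is bytecode size. As far as I understand HotSpot, "hot method too big" is reported when a frequently called method is larger than the FreqInlineSize limit, while MaxInlineSize applies to methods that are not hot. The sketch below pulls both limits out of the JVM's flag dump so they can be compared with the "(N bytes)" value in the PrintInlining output. The function name inline_limits is hypothetical, and the parsing assumes the usual "type name = value" column layout of -XX:+PrintFlagsFinal.

```shell
# Hypothetical helper: reads `java -XX:+PrintFlagsFinal -version` output
# on stdin and prints the two bytecode-size inlining limits as name=value.
inline_limits() {
  # Columns are: type, flag name, "=", value, then annotations.
  awk '$2 == "MaxInlineSize" || $2 == "FreqInlineSize" { print $2 "=" $4 }'
}

# Example:
#   java -XX:+PrintFlagsFinal -version 2>/dev/null | inline_limits
```

Values differ between platforms and JVM versions, so it is worth checking them on the same JVM that runs the cluster.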
2. added the following to JAVA_OPTS in accumulo-env.sh
-XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining
I was able to get this working only after adding these flags to each of the server cases in the case statement:
case "$cmd" in
manager | master) JAVA_OPTS=('-Xmx512m' '-Xms512m' '-XX:+PrintCompilation' '-XX:+UnlockDiagnosticVMOptions' '-XX:+PrintInlining' "${JAVA_OPTS[@]}") ;;
monitor) JAVA_OPTS=('-Xmx256m' '-Xms256m' '-XX:+PrintCompilation' '-XX:+UnlockDiagnosticVMOptions' '-XX:+PrintInlining' "${JAVA_OPTS[@]}") ;;
gc) JAVA_OPTS=('-Xmx256m' '-Xms256m' '-XX:+PrintCompilation' '-XX:+UnlockDiagnosticVMOptions' '-XX:+PrintInlining' "${JAVA_OPTS[@]}") ;;
tserver) JAVA_OPTS=("${JAVA_OPTS[@]}" '-Xmx6G' '-Xms6G' '-XX:+PrintCompilation' '-XX:+UnlockDiagnosticVMOptions' '-XX:+PrintInlining') ;;
compaction-coordinator) JAVA_OPTS=('-Xmx512m' '-Xms512m' '-XX:+PrintCompilation' '-XX:+UnlockDiagnosticVMOptions' '-XX:+PrintInlining' "${JAVA_OPTS[@]}") ;;
compactor) JAVA_OPTS=('-Xmx256m' '-Xms256m' '-XX:+PrintCompilation' '-XX:+UnlockDiagnosticVMOptions' '-XX:+PrintInlining' "${JAVA_OPTS[@]}") ;;
*) JAVA_OPTS=('-Xmx256m' '-Xms64m' "${JAVA_OPTS[@]}") ;;
esac
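Repeating the three flags in every branch is easy to get wrong, so a less repetitive variant is sketched here. JIT_DIAG_OPTS and add_jit_diag_opts are hypothetical names that do not exist in accumulo-env.sh, and the sketch assumes bash, since it relies on arrays as the script above does. Note that -XX:+UnlockDiagnosticVMOptions has to come before -XX:+PrintInlining on the command line, which the array order preserves.

```shell
# Hypothetical names; not part of the stock accumulo-env.sh.
# Order matters: the diagnostic options must be unlocked before
# -XX:+PrintInlining is seen by the JVM.
JIT_DIAG_OPTS=('-XX:+PrintCompilation' '-XX:+UnlockDiagnosticVMOptions' '-XX:+PrintInlining')

# Append the JIT diagnostic flags to JAVA_OPTS for server processes only.
add_jit_diag_opts() {
  local cmd="$1"
  case "$cmd" in
    manager | master | monitor | gc | tserver | compaction-coordinator | compactor)
      JAVA_OPTS=("${JAVA_OPTS[@]}" "${JIT_DIAG_OPTS[@]}")
      ;;
    *)
      # Leave client commands such as the shell untouched.
      ;;
  esac
}
```

Calling add_jit_diag_opts "$cmd" once after the existing case statement would then leave the memory settings in each branch unchanged.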
I also got a different list of hot methods:
9 org.apache.accumulo.core.dataImpl.thrift.TMutation$TMutationStandardScheme::read
53 org.apache.accumulo.core.data.Key::equals
12 org.apache.accumulo.core.file.rfile.bcfile.BCFile$Reader::<init>
30 org.apache.accumulo.core.file.rfile.bcfile.Utils::writeVLong
15 org.apache.accumulo.core.file.rfile.RelativeKey::<init>
5 org.apache.accumulo.core.file.rfile.RelativeKey::readFields
3 org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader::_seek
19 org.apache.accumulo.core.iteratorsImpl.system.SourceSwitchingIterator::readNext
2 org.apache.accumulo.server.constraints.MetadataConstraints::check
3 org.apache.accumulo.server.fs.FileManager$ScanFileManager::openFiles
3 org.apache.accumulo.server.fs.FileManager::reserveReaders
1 org.apache.accumulo.tserver.scan.LookupTask::run
3 org.apache.accumulo.tserver.scan.NextBatchTask::run
5 org.apache.accumulo.tserver.tablet.ScanDataSource::createIterator
3 org.apache.accumulo.tserver.tablet.Scanner::read
3 org.apache.accumulo.tserver.tablet.TabletBase::nextBatch
3 org.apache.accumulo.tserver.ThriftScanClientHandler::continueScan
After running random walk from accumulo-testing, a few more methods popped up that I'll look into:
3 org.apache.accumulo.tserver.tablet.CompactableImpl$FileManager::getCandidates
3 org.apache.accumulo.tserver.TabletClientHandler::setUpdateTablet
4 org.apache.accumulo.tserver.compactions.CompactionService::submitCompactionJob
1 org.apache.accumulo.server.compaction.FileCompactor::openMapDataFiles
2 org.apache.accumulo.core.spi.compaction.DefaultCompactionPlanner::makePlan
5 org.apache.accumulo.core.util.LocalityGroupUtil$Partitioner::partition
2 org.apache.accumulo.core.master.thrift.TableInfo$TableInfoStandardScheme::read
2 org.apache.accumulo.core.file.rfile.MultiLevelIndex$IndexBlock::readFields
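Picking out the methods that are new in a later run can also be scripted. The sketch below assumes the output of each run was saved to a file in the "count method" format that uniq -c produces. The function name new_hot_methods and the file names in the example are hypothetical.

```shell
# Hypothetical helper.
# Usage: new_hot_methods BASELINE_COUNTS CURRENT_COUNTS
# Prints methods listed in CURRENT_COUNTS that are absent from
# BASELINE_COUNTS, in the order they appear in CURRENT_COUNTS.
new_hot_methods() {
  # While reading the first file (NR == FNR), remember each method name.
  # While reading the second, print names that were not remembered.
  # Caveat: an empty baseline file defeats the NR == FNR test.
  awk 'NR == FNR { seen[$2] = 1; next } !($2 in seen) { print $2 }' "$1" "$2"
}

# Example:
#   new_hot_methods cingest-counts.txt randomwalk-counts.txt
```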
Some work was done to address these already. Is there more work to be done for this, or can this issue be closed? (Closing won't preclude us from seeking out hot methods in the future. It will just wrap up this investigation.)
I'm going to close it out and we can open a new issue if needed for other hot methods.