jdk
jdk copied to clipboard
8342040: Further improve entry lookup performance for multi-release JARs
Please review this PR which speeds up JarFile::getEntry lookup significantly for multi-release JAR files.
The changes in this PR are motivated by the following insights:
META-INF/versions/is sparsely populated.- Most entries are not versioned
- The number of unique versions for each versioned entry is small
- Many JAR files are 'accidentally' multi-release; they use the feature to hide
module-info.classfrom Java 8.
Instead of performing one lookup for every version identified in the JAR, this PR narrows the version search down to an approximation of the number of versions found for the entry being looked up, which will often be zero. This speeds up lookup for non-versioned entries, and provides a more targeted search for versioned entries.
An alternative approach could be to normalize the hash code to use the none-versioned name such that versioned and non-versioned names would be resolved in the same lookup. This was quickly abandoned since the code changes were intrusive and mixed too many JAR specific concerns into ZipFile.
Testing: The existing JarFileGetEntry benchmark is updated to optionally test a multi-release JAR file with one versioned entry for module-info.class plus two other versioned class files for two distinct versions. Performance results in first comment.
Running ZipFileOpen on a multi-release JAR did not show a significat difference between this PR and mainline.
The JAR and ZIP tests are run locally. GHA results green. The noreg-perf label is added in JBS.
Progress
- [x] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
- [x] Change must not contain extraneous whitespace
- [x] Commit message must refer to an issue
Issue
- JDK-8342040: Further improve entry lookup performance for multi-release JARs (Enhancement - P4)
Reviewers
- Claes Redestad (@cl4es - Reviewer)
Contributors
- Claes Redestad
<[email protected]>
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/21489/head:pull/21489
$ git checkout pull/21489
Update a local copy of the PR:
$ git checkout pull/21489
$ git pull https://git.openjdk.org/jdk.git pull/21489/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 21489
View PR using the GUI difftool:
$ git pr show -t 21489
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/21489.diff
Webrev
:wave: Welcome back eirbjo! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.
@eirbjo This change now passes all automated pre-integration checks.
ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.
After integration, the commit message for the final commit will be:
8342040: Further improve entry lookup performance for multi-release JARs
Co-authored-by: Claes Redestad <[email protected]>
Reviewed-by: redestad
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.
At the time when this comment was updated there had been 149 new commits pushed to the master branch:
- 28252bb48da3c960a1a261af26650d74235a9531: 8341444: Unnecessary check for JSRs in CDS
- 8174cbd5cb797a80d48246a686897ef6fe64ed57: 8341978: Improve JButton/bug4490179.java
- 9201e9fcc28cff37cf9996e8db38f9aee7511b1c: 8342409: [s390x] C1 unwind_handler fails to unlock synchronized methods with LM_MONITOR
- 0963b9e8918670badc956a325fe2ca0bf55f6d29: 8341664: ReferenceClassDescImpl cache internalName
- c51a086ce32dd4e97aa83dfba3bcf9b0636193cc: 8339694: ciTypeFlow does not correctly handle unresolved constant dynamic of array type
- 7f4ed5001efac28f02fbbb78893051e28cc33e80: 8341020: Error handler crashes when Metaspace is not fully initialized
- f50bd0d9ec65a6b9596805d0131aaefc1bb913f3: 8341513: Remove the unused thread_type field from OSThread
- 7a16906ed0dce716bc9516cb75b6450725fe9dbb: 8341134: Deprecate for removal the jrunscript tool
- ffe60919df59196d65832b8ce6b2cd38099d64df: 8173970: jar tool should have a way to extract to a directory
- 2b03dbdac4819bc0d40912f273a1ca7ab4e8715e: 8311530: Deprecate jdk.jsobject module for removal
- ... and 139 more: https://git.openjdk.org/jdk/compare/3fba1702cd8dc817b11bfa51077c41424d289281...master
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.
➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.
@eirbjo The following labels will be automatically applied to this pull request:
core-libssecurity
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.
Performance results:
Baseline:
Benchmark (mr) (size) Mode Cnt Score Error Units
JarFileGetEntry.getEntryHit false 1024 avgt 15 64.619 ? 6.633 ns/op
JarFileGetEntry.getEntryHit true 1024 avgt 15 301.770 ? 4.819 ns/op
JarFileGetEntry.getEntryHitUncached false 1024 avgt 15 82.030 ? 2.057 ns/op
JarFileGetEntry.getEntryHitUncached true 1024 avgt 15 327.966 ? 3.092 ns/op
JarFileGetEntry.getEntryMiss false 1024 avgt 15 27.937 ? 0.982 ns/op
JarFileGetEntry.getEntryMiss true 1024 avgt 15 267.280 ? 2.196 ns/op
JarFileGetEntry.getEntryMissUncached false 1024 avgt 15 53.214 ? 1.085 ns/op
JarFileGetEntry.getEntryMissUncached true 1024 avgt 15 290.904 ? 2.359 ns/op
PR:
Benchmark (mr) (size) Mode Cnt Score Error Units
JarFileGetEntry.getEntryHit false 1024 avgt 15 63.311 ? 3.014 ns/op
JarFileGetEntry.getEntryHit true 1024 avgt 15 71.333 ? 1.510 ns/op
JarFileGetEntry.getEntryHitUncached false 1024 avgt 15 80.401 ? 0.516 ns/op
JarFileGetEntry.getEntryHitUncached true 1024 avgt 15 94.810 ? 1.136 ns/op
JarFileGetEntry.getEntryMiss false 1024 avgt 15 26.717 ? 0.276 ns/op
JarFileGetEntry.getEntryMiss true 1024 avgt 15 45.677 ? 0.412 ns/op
JarFileGetEntry.getEntryMissUncached false 1024 avgt 15 53.122 ? 0.954 ns/op
JarFileGetEntry.getEntryMissUncached true 1024 avgt 15 66.583 ? 0.425 ns/op
Marking this PR as draft while investigating alternatives as proposed by @cl4es
⚠️ @eirbjo This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request in integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).
/contributor add cl4es
@eirbjo cl4es was not found in the census.
Syntax: /contributor (add|remove) [@user | openjdk-user | Full Name <email@address>]. For example:
/contributor add @openjdk-bot/contributor add duke/contributor add J. Duke <[email protected]>
User names can only be used for users in the census associated with this repository. For other contributors you need to supply the full name and email address.
/contributor add @cl4es
@eirbjo
Contributor Claes Redestad <[email protected]> successfully added.
Bringing this back from draft mode after some feedback and exploration with @cl4es
The current version of this PR now uses a BitSet to represent versions. This speeds up version tracking in initCEN for the unusual case that there is a large number of versioned entries in the ZIP.
To summarize the performance benefits of this PR over baseline:
- The lookup overhead of multi-release with a few versioned entries is reduced from 5X to 1.2X
- For missing entries (very common in URLClassPath!), the overhead is reduced from 10X to 1.8X
Claes has approved this PR already, but since he contributed code and ideas to this PR, I'll let this linger until early next week to allow non-Scandinavian reviewers to have a look :-)
/integrate
Going to push as commit d49f21043b84ebcc8b9176de3a84621ca7bca8fb.
Since your change was applied there have been 267 commits pushed to the master branch:
- d2e716eb72ea603fce50f0757a766ec623ef2faf: 8331958: Update PC/SC Lite for Suse Linux to 2.3.0
- a95374f588149d80068275a496ba4aa04b3bb4fd: 8343101: Rework BasicTest.testTemp test cases
- 00fe9f7bdfd245791bca6b5b1b2d0a98d41af221: 8343100: Consolidate EmptyFolderTest and EmptyFolderPackageTest jpackage tests into single java file
- 9f6d5b46ce2cfcdb39f94b8ac8621ee21f4e8740: 8343020: (fs) Add support for SecureDirectoryStream on macOS
- 1341b81321fe77005ba68fba19c7d83e3fcb5fde: 8341666: FileInputStream doesn't support readAllBytes() or readNBytes(int) on pseudo devices
- 52382e285fdf853c01605f8e0d7f3f5d34965802: 8338021: Support new unsigned and saturating vector operators in VectorAPI
- e659d9da5d6198ad9c85efd6472e138a6a3961c2: 8342975: C2: Micro-optimize PhaseIdealLoop::Dominators()
- 9f6211bcf1b46e4bfba2d128d9eb8457bc0cde51: 8341371: CDS cannot load archived heap objects with -XX:+UseSerialGC -XX:-UseCompressedOops
- 120a9357b3cf63427a6c8539128b69b11b9beca3: 8342561: Metaspace for generated reflection classes is no longer needed
- d5fb6b4a3cf4926acb333e7ee55f96fc76225631: 8339939: [JVMCI] Don't compress abstract and interface Klasses
- ... and 257 more: https://git.openjdk.org/jdk/compare/3fba1702cd8dc817b11bfa51077c41424d289281...master
Your commit was automatically rebased without conflicts.
@eirbjo Pushed as commit d49f21043b84ebcc8b9176de3a84621ca7bca8fb.
:bulb: You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.