jdk JDK-8293711: Factor out size parsing functions from arguments.cpp

Arguments.cpp has several size parsing functions which would be useful in other areas of the hotspot, e.g. in NMT.

It would be nice to have them factored out into utilities, to reuse the code and to unify memory size handling. Gtests would be good too.

To simplify reviews, I split the patch into two commits.

The first commit (https://github.com/openjdk/jdk/pull/10252/commits/700e77e8d1469a2fc3d6611072c4b07aa34ab8e6) contains the unchanged code move without functional changes.

The second commit (https://github.com/openjdk/jdk/pull/10252/commits/76b4f6f30cc316fd966da60ff6601f54eeb394bf) contains the functional changes I did, as well as the new gtest.

Progress

[x] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
[x] Change must not contain extraneous whitespace
[x] Commit message must refer to an issue

Issue

JDK-8293711: Factor out size parsing functions from arguments.cpp

Reviewers

David Holmes (@dholmes-ora - Reviewer)
Johan Sjölen (@jdksjolen - Author)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/10252/head:pull/10252
$ git checkout pull/10252

Update a local copy of the PR:
$ git checkout pull/10252
$ git pull https://git.openjdk.org/jdk pull/10252/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 10252

View PR using the GUI difftool:
$ git pr show -t 10252

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/10252.diff

Sep 13 '22 15:09 tstuefe

:wave: Welcome back stuefe! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

Sep 13 '22 15:09 bridgekeeper[bot]

@tstuefe The following label will be automatically applied to this pull request:

hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

Sep 13 '22 15:09 openjdk[bot]

Webrevs

Sep 14 '22 13:09 mlbridge[bot]

Seems like you can remove #include "metaprogramming/enableIf.hpp" from arguments.cpp

Sep 20 '22 14:09 jdksjolen

@tstuefe Please do not rebase or force-push to an active PR as it invalidates existing review comments. All changes will be squashed into a single commit automatically when integrating. See OpenJDK Developers’ Guide for more information.

Sep 20 '22 15:09 openjdk-notifier[bot]

Basic moving of the parsing functions to their own header file (modulo the naming of that file) seems okay. But I don't follow the "memory size" aspect of this refactoring. ??

I splitted the original function set into two. One parsing integers, one atop of that parsing memory sizes. These functions existed before, but I renamed the original one to parse_memory_size to make it clear what it actually does.

Before we had: A) parse_integer() -> does not parse integer, but memory sizes, e.g. "1K", "1G", etc B) parse_integer_impl() -> helper functions to parse real integers

Now we have: A) parse_memory_size() -> parses memory sizes B) parse_integer() -> parses integers

Sep 21 '22 06:09 tstuefe

Hi @dholmes-ora, @stefank,

thanks for looking at this. I worked in your feedback. @dholmes-ora: I hope my comment above made my intent a bit clearer.

Cheers, Thomas

Sep 21 '22 06:09 tstuefe

BTW, this fix intends to simplify the solution for https://github.com/openjdk/jdk/pull/10144 which then will make it possible to remove -XX:MallocMaxTestWords and its supporting code, and use NMT malloc limits instead (e.g. in compiler tests). So, the end result would hopefully be a code reduction too.

Sep 21 '22 06:09 tstuefe

Before we had: A) parse_integer() -> does not parse integer, but memory sizes, e.g. "1K", "1G", etc

Hmmm, I don't consider those "memory sizes" even if often used for such they are just ways of expressing large integers in a simple way. They are used for file sizes and data-structure sizes as well.

Sep 21 '22 06:09 dholmes-ora

Before we had: A) parse_integer() -> does not parse integer, but memory sizes, e.g. "1K", "1G", etc

Hmmm, I don't consider those "memory sizes" even if often used for such they are just ways of expressing large integers in a simple way. They are used for file sizes and data-structure sizes as well.

I'm unemotional about the name as long as its clearly different from a simple integer parsing.

How about "parse_size" ? "parse_si_size" ? (though we don't implement the full complement of si units). "parse_size_with_units" ?

Sep 21 '22 07:09 tstuefe

I'm not sure why you need to have two different parsing routines when one routine can parse everything. Things like 1K, 1M etc are defined by Unicode as "compact number formats", so you could incorporate that in the name if you had to, but I don't see why you want/need to.

Sep 21 '22 08:09 dholmes-ora

I'm not sure why you need to have two different parsing routines when one routine can parse everything. Things like 1K, 1M etc are defined by Unicode as "compact number formats", so you could incorporate that in the name if you had to, but I don't see why you want/need to.

Okay, so you want me to hide the parse_integer() variant, and possibly rename parse_memory back to parse_integer?

Sep 21 '22 08:09 tstuefe

Okay so we currently have parse_integer that calls parse_integer_impl to do the raw numeric parsing, then applies the modifiers conversions itself. I don't see why that structure needs to change just because your are refactoring the code.

Sep 21 '22 09:09 dholmes-ora

Okay so we currently have parse_integer that calls parse_integer_impl to do the raw numeric parsing, then applies the modifiers conversions itself. I don't see why that structure needs to change just because your are refactoring the code.

My intention was to give the user two utility functions, one which parses integers (stroll with some benefits) and one which I can feed memory sizes. But since this is a point of contention, I just revert that part, its not really important.

Sep 21 '22 10:09 tstuefe

parse_memory_size renamed back to parse_integer as David requested.

Sep 21 '22 12:09 tstuefe

@stefank @dholmes-ora @jdksjolen is the new version acceptable?

Sep 22 '22 17:09 tstuefe

@tstuefe This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8293711: Factor out size parsing functions from arguments.cpp

Reviewed-by: dholmes, jsjolen

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 424 new commits pushed to the master branch:

5d273b9f040a9884e2ae5b0f1409a3f9075c51c9: 8295278: Add parallel class loading tests
172006c0e9433046252bd79e8864890ab7c0ce56: 8295333: G1: Remove unnecessary check in G1Policy::calculate_desired_eden_length_by_mmu
7743345f6f73398f280fd18364b4cea10a6b0f2f: 8294314: Minimize disabled warnings in hotspot
552d8a2821f03046896a728d6e4cec0ef754d3f4: 8295192: Use original configure command line when called from a script
cf07eaeb9291da725181832b8bb1dc54957ba886: 8295020: javac emits incorrect code for for-each on an intersection type.
b3bb3e6ed89f3abcaae584fcbe75688141e886cb: 8295325: tools/jlink/plugins/SaveJlinkArgfilesPluginTest.java fails on Linux ppc64le
9005af3b90fbd3607aeb83efe1c4a6ffa5d104f0: 8295110: RISC-V: Mark out relocations as incompressible
74a51ccc86525eb4b1eb2e5cb11e605ca9e9fc5d: 8292698: Improve performance of DataInputStream
d3781ac8a38943d8a20304e770b01d5418ee33d0: 8295009: RISC-V: Interpreter intrinsify Thread.currentThread()
e7d0ab227ff86bb65abf7fbeb135ce657454200b: 8295379: ProblemList java/lang/Float/Binary16Conversion.java in Xcomp mode on x64
... and 414 more: https://git.openjdk.org/jdk/compare/9cd3e355d1f5216626daa6a9669b0c95343ca4f0...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

Sep 22 '22 23:09 openjdk[bot]

is the new version acceptable?

I would have preferred if the parsing code were not placed in a .hpp file, and would have placed it in a .inline.hpp file, to comply with our guidelines. Other than that, I'm OK with this patch.

Sep 23 '22 07:09 stefank

Thanks Johan. I'll wait for @stefank a bit more, but will commit later today since I'm having vacation coming up.

Sep 23 '22 07:09 tstuefe

is the new version acceptable?

I would have preferred if the parsing code were not placed in a .hpp file, and would have placed it in a .inline.hpp file, to comply with our guidelines. Other than that, I'm OK with this patch.

Almost missed your comment :-)

I'll split it up as you suggested, then push.

Sep 23 '22 14:09 tstuefe

is the new version acceptable?

I would have preferred if the parsing code were not placed in a .hpp file, and would have placed it in a .inline.hpp file, to comply with our guidelines. Other than that, I'm OK with this patch.

Almost missed your comment :-)

I'll split it up as you suggested, then push.

Wait, since the whole thing is all parsing, would that not mean we have just a parseInteger.inline.hpp, without a parseInteger.hpp? Since I'm unsure what would be left to put into parseInteger.hpp. And a parseInteger.inline.hpp without a parseInteger.hpp makes not much sense, or?

We have some other one-trick-pony-headers that are not .inline and contain their whole functionality: powerOfTwo.hpp, population_count.hpp (probably should rename that), pair.hpp...

Sep 23 '22 14:09 tstuefe

is the new version acceptable?

I would have preferred if the parsing code were not placed in a .hpp file, and would have placed it in a .inline.hpp file, to comply with our guidelines. Other than that, I'm OK with this patch.

Almost missed your comment :-) I'll split it up as you suggested, then push.

Wait, since the whole thing is all parsing, would that not mean we have just a parseInteger.inline.hpp, without a parseInteger.hpp? Since I'm unsure what would be left to put into parseInteger.hpp. And a parseInteger.inline.hpp without a parseInteger.hpp makes not much sense, or?

We have some other one-trick-pony-headers that are not .inline and contain their whole functionality: powerOfTwo.hpp, population_count.hpp (probably should rename that), pair.hpp...

Right. There's a pragmatic side to the rules w.r.t .hpp vs .inline.hpp files. If the file doesn't have dependencies to "non-trivial" header files, then we typically let it slide and put the code in the .hpp, as you have seen in the files you list. Another notable example is atomic.inline.hpp, which was renamed to atomic.hpp. I think we did that, since were too many header files that used Atomic, and it would have been a lot of work to clean that up.

The questions for me now are:

Is this header small enough that it doesn't pull in large amounts of other headers?
Is it likely to be changed in the future to include "non-trivial" headers?
Is it likely to be used in other .hpp files?

I think this it is small enough (1), will only be used in .cpp/.inline.hpp files (3), but I'm unsure about (2).

I'll leave it to your judgement to decide if this is fine to leave in the .hpp file.

Sep 26 '22 07:09 stefank

Hi @stefank,

sorry, I was gone for vacation and did not want to rush this before leaving.

is the new version acceptable?

I would have preferred if the parsing code were not placed in a .hpp file, and would have placed it in a .inline.hpp file, to comply with our guidelines. Other than that, I'm OK with this patch.

Almost missed your comment :-) I'll split it up as you suggested, then push.

Wait, since the whole thing is all parsing, would that not mean we have just a parseInteger.inline.hpp, without a parseInteger.hpp? Since I'm unsure what would be left to put into parseInteger.hpp. And a parseInteger.inline.hpp without a parseInteger.hpp makes not much sense, or? We have some other one-trick-pony-headers that are not .inline and contain their whole functionality: powerOfTwo.hpp, population_count.hpp (probably should rename that), pair.hpp...

Right. There's a pragmatic side to the rules w.r.t .hpp vs .inline.hpp files. If the file doesn't have dependencies to "non-trivial" header files, then we typically let it slide and put the code in the .hpp, as you have seen in the files you list. Another notable example is atomic.inline.hpp, which was renamed to atomic.hpp. I think we did that, since were too many header files that used Atomic, and it would have been a lot of work to clean that up.

The questions for me now are:

Is this header small enough that it doesn't pull in large amounts of other headers?

Is it likely to be changed in the future to include "non-trivial" headers?

Is it likely to be used in other .hpp files?

I think this it is small enough (1), will only be used in .cpp/.inline.hpp files (3), but I'm unsure about (2).

I'll leave it to your judgement to decide if this is fine to leave in the .hpp file.

I don't think this file will see much changes in the future. So I think we can leave the implementations in the hpp file. Thank you!

Oct 17 '22 12:10 tstuefe

/integrate

Oct 17 '22 12:10 tstuefe

Going to push as commit ec2981b83bc3ef6977b5f16d5222eb49b0ea49ad. Since your change was applied there have been 424 commits pushed to the master branch:

5d273b9f040a9884e2ae5b0f1409a3f9075c51c9: 8295278: Add parallel class loading tests
172006c0e9433046252bd79e8864890ab7c0ce56: 8295333: G1: Remove unnecessary check in G1Policy::calculate_desired_eden_length_by_mmu
7743345f6f73398f280fd18364b4cea10a6b0f2f: 8294314: Minimize disabled warnings in hotspot
552d8a2821f03046896a728d6e4cec0ef754d3f4: 8295192: Use original configure command line when called from a script
cf07eaeb9291da725181832b8bb1dc54957ba886: 8295020: javac emits incorrect code for for-each on an intersection type.
b3bb3e6ed89f3abcaae584fcbe75688141e886cb: 8295325: tools/jlink/plugins/SaveJlinkArgfilesPluginTest.java fails on Linux ppc64le
9005af3b90fbd3607aeb83efe1c4a6ffa5d104f0: 8295110: RISC-V: Mark out relocations as incompressible
74a51ccc86525eb4b1eb2e5cb11e605ca9e9fc5d: 8292698: Improve performance of DataInputStream
d3781ac8a38943d8a20304e770b01d5418ee33d0: 8295009: RISC-V: Interpreter intrinsify Thread.currentThread()
e7d0ab227ff86bb65abf7fbeb135ce657454200b: 8295379: ProblemList java/lang/Float/Binary16Conversion.java in Xcomp mode on x64
... and 414 more: https://git.openjdk.org/jdk/compare/9cd3e355d1f5216626daa6a9669b0c95343ca4f0...master

Your commit was automatically rebased without conflicts.

Oct 17 '22 12:10 openjdk[bot]

@tstuefe Pushed as commit ec2981b83bc3ef6977b5f16d5222eb49b0ea49ad.

:bulb: You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Oct 17 '22 12:10 openjdk[bot]

jdk jdk copied to clipboard

JDK-8293711: Factor out size parsing functions from arguments.cpp

Progress

Issue

Reviewers

Reviewing

Webrevs

jdk
jdk copied to clipboard