jdk
jdk copied to clipboard
8315767: InetAddress: constructing objects from BSD literal addresses
There are two distinct approaches to parsing IPv4 literal addresses. One is the Java baseline "strict" syntax (all-decimal d.d.d.d form family), another one is the "loose" syntax of RFC 6943 section 3.1.1 [1] (POSIX inet_addr
allowing octal and hexadecimal forms [2]). The goal of this PR is to provide interface to construct InetAddress objects from literal addresses in POSIX form, to applications that need to mimic the behavior of inet_addr
used by standard network utilities such as netcat/curl/wget and the majority of web browsers. At present time, there's no way to construct InetAddress
object from such literal addresses because the existing APIs such as InetAddress.getByName()
, InetAddress#ofLiteral()
and Inet4Address#ofLiteral()
will consume an address and successfully parse it as decimal, regardless of the octal prefix. Hence, the resulting object will point to a different IP address.
Historically InetAddress.getByName()/.getAllByName()
were the only way to convert a literal address into an InetAddress object. getAllByName()
API relies on POSIX getaddrinfo
/ inet_addr
which parses IP address segments with strtoul
(accepts octal and hexadecimal bases).
The fallback to getaddrinfo
is undesirable as it may end up with network queries (blocking mode), if inet_addr
rejects the input literal address. The Java standard explicitly says that
"If a literal IP address is supplied, only the validity of the address format is checked."
@AlekseiEfimov contributed JDK-8272215 [3] that adds new factory methods .ofLiteral()
to InetAddress
classes. Although the new API is not affected by the getaddrinfo
fallback issue, it is not sufficient for an application that needs to interoperate with external tooling that follows POSIX standard. In the current state, InetAddress#ofLiteral()
and Inet4Address#ofLiteral()
will consume the input literal address and (regardless of the octal prefix) parse it as decimal numbers. Hence, it's not possible to reliably construct an InetAddress
object from a literal address in POSIX form that would point to the desired host.
It is proposed to extend the factory methods with Inet4Address#ofPosixLiteral()
that allows parsing literal IP(v4) addresses in "loose" syntax, compatible with inet_addr
POSIX api. The implementation is based on .isBsdParsableV4()
method added along with JDK-8277608 [4]. The changes in the original algorithm are as follows:
-
IPAddressUtil#parseBsdLiteralV4()
method is extracted from.isBsdParsableV4()
- an additional check is added, whether an input string is empty
-
null
is returned whenever the original algorithm fails - a condition was added to the parser loop, that stores the IPv4 address segment value when it fits in 1 byte (0 <= x < 256)
- an additional check was added to verify that the last field value is non-negative
- when the last field value is multi-byte (the number of fields is less than 4), it is written to the last (4-(fieldNumber-1)) octets
The new method hasn't been added to InetAddress superclass because the change is only related to IPv4 addressing. This reduces the chance that client code will call the wrong factory method.
test/jdk/java/net/InetAddress/OfLiteralTest.java
was updated to include .ofPosixLiteral()
tests
Javadocs in Inet4Address
were updated accordingly
The new method can be used as follows
import java.net.InetAddress;
import java.net.Inet4Address;
public class Test {
public static void main(String[] args) throws Throwable {
if (args.length < 1) {
System.err.println("USAGE: java Test <host>");
return;
}
InetAddress ia = Inet4Address.ofPosixLiteral(args[0]);
System.out.println(ia.toString());
}
}
The output would be
$ ./build/images/jdk/bin/java Test 2130706433
/127.0.0.1
$ ./build/images/jdk/bin/java Test 02130706433
/17.99.141.27
$ ./build/images/jdk/bin/java Test 2130706438
/127.0.0.6
$ ./build/images/jdk/bin/java Test 02130706438
Exception in thread "main" java.lang.IllegalArgumentException: Invalid IP address literal: 02130706438
at java.base/sun.net.util.IPAddressUtil.invalidIpAddressLiteral(IPAddressUtil.java:169)
at java.base/java.net.Inet4Address.parseAddressStringPosix(Inet4Address.java:302)
at java.base/java.net.Inet4Address.ofPosixLiteral(Inet4Address.java:239)
at Test.main(Test.java:10)
[1] https://www.ietf.org/rfc/rfc6943.html#section-3.1.1 [2] https://pubs.opengroup.org/onlinepubs/9699919799/functions/inet_addr.html [3] https://bugs.openjdk.org/browse/JDK-8272215 [4] https://github.com/openjdk/jdk/commit/cdc1582d1d7629c2077f6cd19786d23323111018
Progress
- [x] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
- [x] Change must not contain extraneous whitespace
- [ ] Change requires CSR request JDK-8329876 to be approved
- [x] Commit message must refer to an issue
Issues
- JDK-8315767: InetAddress: constructing objects from BSD literal addresses (Enhancement - P3)
- JDK-8329876: InetAddress: constructing objects from BSD literal addresses (CSR)
Reviewers
- Daniel Fuchs (@dfuch - Reviewer)
- Aleksei Efimov (@AlekseiEfimov - Reviewer)
- Michael McMahon (@Michael-Mc-Mahon - Reviewer)
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/18493/head:pull/18493
$ git checkout pull/18493
Update a local copy of the PR:
$ git checkout pull/18493
$ git pull https://git.openjdk.org/jdk.git pull/18493/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 18493
View PR using the GUI difftool:
$ git pr show -t 18493
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/18493.diff
Webrev
:wave: Welcome back schernyshev! A progress list of the required criteria for merging this PR into master
will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.
@sercher This change now passes all automated pre-integration checks.
ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.
After integration, the commit message for the final commit will be:
8315767: InetAddress: constructing objects from BSD literal addresses
Reviewed-by: dfuchs, aefimov, michaelm, jpai
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.
At the time when this comment was updated there had been 726 new commits pushed to the master
branch:
- 2a11e0da026066191e4d4f30b9daca986c484630: 8332743: Update comment related to JDK-8320522
- 6829d9ac67fb131462d3ef1c4bdfaa07df5d6be6: 8332122: [nmt] Totals for malloc should show total peak
- 9d332e6591334a71335da65a4dd7b2ed0482b6cb: 8307193: Several Swing jtreg tests use class.forName on L&F classes
- 98f6a80852383dcbdad7292b7d269a8547d54d45: 8332490: JMH org.openjdk.bench.java.util.zip.InflaterInputStreams.inflaterInputStreamRead OOM
- 3d4185a9ce482cc655a4c67f39cb2682b02ae4fe: 8332739: Problemlist compiler/codecache/CheckLargePages until JDK-8332654 is fixed
- c4557a7b0db5b55585b4caa7cdec81e1c1093cbc: 8332463: Byte conditional pattern case element dominates short constant case element
- d59c12fe1041a1f61f68408241a9aa4d96ac4fd2: 8329718: Incorrect
@since
tags in elements in jdk.compiler and java.compiler - b4d14540851d792b5366a3723abcea1264a5737c: 8332740: [BACKOUT] JDK-8331081 'internal proprietary API' diagnostics if --system is configured to an earlier JDK version
- 37c477856d543163b60dd2b85a5e6ac35a752211: 8332096: hotspot-ide-project fails with this-escape
- 2170e99cb49a4ef2086ecec7515a72d56148d0f2: 8331081: 'internal proprietary API' diagnostics if --system is configured to an earlier JDK version
- ... and 716 more: https://git.openjdk.org/jdk/compare/153410f480c372604e5825bfcd3a63f137e6a013...master
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.
As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@dfuch, @AlekseiEfimov, @Michael-Mc-Mahon, @jaikiran) but any other Committer may sponsor as well.
➡️ To flag this PR as ready for integration with the above commit message, type /integrate
in a new comment. (Afterwards, your sponsor types /sponsor
in a new comment to perform the integration).
@sercher The following label will be automatically applied to this pull request:
-
net
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.
/csr needed
@AlekseiEfimov has indicated that a compatibility and specification (CSR) request is needed for this pull request.
@sercher please create a CSR request for issue JDK-8315767 with the correct fix version. This pull request cannot be integrated until the CSR request is approved.
Webrevs
- 10: Full - Incremental (bdb8f18c)
- 09: Full - Incremental (007807ac)
- 08: Full - Incremental (664fccb1)
- 07: Full - Incremental (6a7a310c)
- 06: Full - Incremental (e07dafad)
- 05: Full - Incremental (2449a331)
- 04: Full - Incremental (d05d9d54)
- 03: Full - Incremental (005fd11e)
- 02: Full - Incremental (ebcbef43)
- 01: Full - Incremental (4230f0f2)
- 00: Full (99ba4544)
Should we be setting any expectations by specifying what InetAddress.getHostAddress()
will return for an Inet4Address
constructed using this new Inet4Address.ofPosixLiteral()
method? In its current form I believe it will continue to return a decimal representation of the IP address. My guess is that we want it to continue behaving that way?
Should we be setting any expectations by specifying what InetAddress.getHostAddress() will return for an Inet4Address constructed using this new Inet4Address.ofPosixLiteral() method? In its current form I believe it will continue to return a decimal representation of the IP address. My guess is that we want it to continue behaving that way?
In the InetAddress class level API documentation we have:
* <p> For methods that return a textual representation as output
* value, the first form, i.e. a dotted-quad string, is used.
Do you think it should be reiterated in the @apiNote
of the ofPosixLiteral
method?
Should we be setting any expectations by specifying what InetAddress.getHostAddress() will return for an Inet4Address constructed using this new Inet4Address.ofPosixLiteral() method? In its current form I believe it will continue to return a decimal representation of the IP address. My guess is that we want it to continue behaving that way?
In the InetAddress class level API documentation we have:
* <p> For methods that return a textual representation as output * value, the first form, i.e. a dotted-quad string, is used.
Do you think it should be reiterated in the
@apiNote
of theofPosixLiteral
method?
The changes proposed in this PR introduce a new paragraph in the class level documentation just before the line which states the dotted-quad string. So I think it may not be clear enough whether dotted-quad string implies decimal values (127.0.0.1
) or octal dotted-quad string (0177.0.0.1
) or hexadecimal dotted-quad string (0x7F.0.0.1
). So I think updating that sentence in class level documentation might help.
The changes proposed in this PR introduce a new paragraph in the class level documentation just before the line which states the dotted-quad string. So I think it may not be clear enough whether dotted-quad string implies decimal values (
127.0.0.1
) or octal dotted-quad string (0177.0.0.1
) or hexadecimal dotted-quad string (0x7F.0.0.1
). So I think updating that sentence in class level documentation might help.
Good point.
Should we be setting any expectations by specifying what
InetAddress.getHostAddress()
will return for anInet4Address
constructed using this newInet4Address.ofPosixLiteral()
method? In its current form I believe it will continue to return a decimal representation of the IP address. My guess is that we want it to continue behaving that way?
That seems to be the case for the native function roundtrip. So, I would it expect it to be the same in Java.
which lists the methods that parse as decimal only, and the new method which parses using the "loose" syntax.
That might not be practical: the only method that supports non decimal form is the new ofBSDLiteral
. But any other method only support decimal notation. These include URL/URI, permissions, InetSocketAdderess, InetAddress, Socket/DatagramSocket constructors, etc... Building a comprehensive list would be both unpractical and difficult.
which lists the methods that parse as decimal only, and the new method which parses using the "loose" syntax.
That might not be practical: the only method that supports non decimal form is the new
ofBSDLiteral
. But any other method only support decimal notation. These include URL/URI, permissions, InetSocketAdderess, InetAddress, Socket/DatagramSocket constructors, etc... Building a comprehensive list would be both unpractical and difficult.
I think we only need to talk about the methods in InetAddress. Though it could be phrased as ofPosixLiteral
supports decimal, hex and octal. All other parsing is decimal only.
I think we only need to talk about the methods in InetAddress. Though it could be phrased as
ofPosixLiteral
supports decimal, hex and octal. All other parsing is decimal only.
Yes. that latter formulation might be better.
Hello Sergey, do you plan to update the PR with the class level doc change suggested by Michael?
Hello Sergey, do you plan to update the PR with the class level doc change suggested by Michael?
Hi @jaikiran ! Yes I plan to update the PR. Apologies for the delay.
Thanks @Michael-Mc-Mahon, @jaikiran. I updated the class level doc so that the sentence "These forms support parts in decimal format only" is completely removed (it wasn't looking good to me any more because it was repeated by the next sentence "For example..."). I didn't add the sub-section "Parsing of literal addresses" because it already belongs to "Textual representation of IPv4 addresses" which is more or less the same. Please see the updated class level doc.
@AlekseiEfimov @jaikiran @Michael-Mc-Mahon Could you please take a look and review the CSR ?
@AlekseiEfimov @jaikiran @Michael-Mc-Mahon Could you please take a look and review the CSR ?
I will take a look at it. Thanks for the reminder
Thanks @Michael-Mc-Mahon, @jaikiran. I updated the class level doc so that the sentence "These forms support parts in decimal format only" is completely removed (it wasn't looking good to me any more because it was repeated by the next sentence "For example..."). I didn't add the sub-section "Parsing of literal addresses" because it already belongs to "Textual representation of IPv4 addresses" which is more or less the same. Please see the updated class level doc.
Ok - so now we seem to be missing the fact that the only form supported by the existing methods is the decimal form. Can we find a wording that makes this clear?
Ok - so now we seem to be missing the fact that the only form supported by the existing methods is the decimal form. Can we find a wording that makes this clear?
Would you think this should be mentioned in the class level doc of Inet4Address class? Alternatively, the method description of ofPosixLiteral
could say that "this is the only method accepting non-decimal forms of address literals". There's no such thing as "existing methods", there are "all methods consuming address literals with an exception of ofPosixLiteral
" - this could be a phrasing in the class level doc. Please let me know what you think.
missing the fact that the only form supported
I'd also note that, when mentioned in the docs that this is the only method capable of parsing this type of strings, it can be limiting to the future APIs that could introduce other methods that accept this type of strings. Adding such methods in the future will invalidate this statement and will require editing the class specification again. @dfuch Could you please help reviewing the CSR?
Thanks - LGTM now. @AlekseiEfimov, @Michael-Mc-Mahon are you also good with this version of the changes?
Thank you @AlekseiEfimov, @dfuch for your reviews!
I'd just like to have one more look at it today, if that's okay.
I read through the API docs in the latest revision (664fccb1) and it looks very good. The new method is not easily discovered, which is okay as it's very much a method for experts, I see wonder if we should put a @see
in InetAddress.ofLiteral.
The new method is not easily discovered, which is okay as it's very much a method for experts, I see wonder if we should put a
@see
in InetAddress.ofLiteral.
Thanks @AlanBateman for your note. There's a link to ofPosixLiteral
from the class level doc. I didn't want to touch ofLiteral
as it already has the link to the #format subtitle in the class doc
Thanks @AlanBateman for your note. There's a link to
ofPosixLiteral
from the class level doc. I didn't want to touchofLiteral
as it already has the link to the #format subtitle in the class doc
I think you may have mis-read my comment. I think we need to add @see InetAddress4#ofPosixLiteral(String)
to InetAddress.ofLiteral, not Inet4Address.ofLiteral. The reason is that methods defined by Inet4Address are not very discoverable and there is no reference to ofPosixLiteral in InetAddress's class description or anywhere else.
I think you may have mis-read my comment. I think we need to add
@see InetAddress4#ofPosixLiteral(String)
to InetAddress.ofLiteral, not Inet4Address.ofLiteral. The reason is that methods defined by Inet4Address are not very discoverable and there is no reference to ofPosixLiteral in InetAddress's class description or anywhere else.
Thanks @AlanBateman , apologize for mis-reading your comment. I added @see
to InetAddress.ofLiteral, please see the updated doc.
/integrate