jdk
jdk copied to clipboard
8191963: Path.equals() and File.equals() return true for two different files on Windows
Replace logic in java.io.WinNTFileSystems.compare(File,File) with that from sun.nio.fs.WindowsPath.compareTo(Path).
Progress
- [ ] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
- [x] Change must not contain extraneous whitespace
- [x] Commit message must refer to an issue
Issue
- JDK-8191963: Path.equals() and File.equals() return true for two different files on Windows (Bug - P4)
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25788/head:pull/25788
$ git checkout pull/25788
Update a local copy of the PR:
$ git checkout pull/25788
$ git pull https://git.openjdk.org/jdk.git pull/25788/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 25788
View PR using the GUI difftool:
$ git pr show -t 25788
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25788.diff
Using Webrev
:wave: Welcome back bpb! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.
❗ This change is not yet ready to be integrated. See the Progress checklist in the description for automated requirements.
@bplb The following label will be automatically applied to this pull request:
core-libs
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.
Just wondering how Windows implementation really behaves for its 'case-insensitive-comparing" for "\u0131 vs "I". I would assume they actually are 2 'distinguished" files?
Just wondering how Windows implementation really behaves for its 'case-insensitive-comparing" for "\u0131 vs "I". I would assume they actually are 2 'distinguished" files?
I think it depends on the locale. Also it is possible to configure NTFS, on a per directory basis, to disable the directory the case sensitivity. WindowsPath.compareTo is where we ended up for JDK 7, it would be good to validate this.
WindowsPath.compareTo is where we ended up for JDK 7, it would be good to validate this.
I assume that you intend here validating correctness with respect to the NTFS case sensitivity settings
Just wondering how Windows implementation really behaves for its 'case-insensitive-comparing" for "\u0131 vs "I". I would assume they actually are 2 'distinguished" files?
I think it depends on the locale. Also it is possible to configure NTFS, on a per directory basis, to disable the directory the case sensitivity. WindowsPath.compareTo is where we ended up for JDK 7, it would be good to validate this.
"Support for per-directory case sensitivity began in Windows 10, build 17107." [*] This is handled on the (PowerShell) command line with
fsutil.exe file setCaseSensitiveInfo <path> [enable | disable]
For the purposes of this request, the equals methods operate on the string representation of the abstract pathname and do not interact with the file system. The specification of File.equals states
On UNIX systems, alphabetic case is significant in comparing pathnames; on Microsoft Windows systems it is not.
If it were necessary to account for the directory case sensitivity setting, something like the NtQueryInformationFile function would be needed. Changing to case-sensitive would likely break a lot of code.
[*] https://learn.microsoft.com/en-us/windows/wsl/case-sensitivity
For the purposes of this request, the equals methods operate on the string representation of the abstract pathname and do not interact with the file system.
Exactly, so I think the first step here has to validate that where we ended up in WindowsPath.compareTo is correct.
Exactly, so I think the first step here has to validate that where we ended up in WindowsPath.compareTo is correct.
WindowsPath.compareTo returns zero (equal) if and only if the path strings are the same length, and the two characters in each pair of corresponding characters in the strings are either already equal, or are equal after being converted to upper case. Unless case sensitivity or locale were to be admitted, it's not clear to me whether, using the strings alone, anything substantively different could be done.
@bplb This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply issue a /touch or /keepalive command to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!
/touch
@bplb The pull request is being re-evaluated and the inactivity timeout has been reset.
@bplb This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply issue a /touch or /keepalive command to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!
/touch
@bplb The pull request is being re-evaluated and the inactivity timeout has been reset.
@bplb This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply issue a /touch or /keepalive command to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!
[I]t's not clear to me whether, using the strings alone, anything substantively different could be done.
It's still unclear to me what else could be done for this issue, especially given the File.equals specification:
On UNIX systems, alphabetic case is significant in comparing pathnames; on
Microsoft Windows systems it is not. This method only tests whether the abstract
pathnames are equal; it does not access the file system and the file is not required
to exist.
We are bound to case insensitivity and not accessing the file system, unless the specification were to change.
If this none too recent MSFT blog entry is to be believed, Path.equals and the proposed changed version of File.equals match the behavior of the Windows operating system.
I think it is risky to change behavior after 20+ years. I think we got it right with WindowsPath but changing File means working through the compatibility impact and probably a system property to restore long standing behavior.
[...] changing File means working through the compatibility impact [...]
It's probably not enough, but:
java.io.WinNTFileSystem.compare(File,File)is invoked only byFile.equals.- Tier 1-3 tests passed in the mainline CI with this change.
[...] system property to restore long standing behavior.
Added in a63e13e.
/csr
@bplb has indicated that a compatibility and specification (CSR) request is needed for this pull request.
@bplb please create a CSR request for issue JDK-8191963 with the correct fix version. This pull request cannot be integrated until the CSR request is approved.
[...] system property to restore long standing behavior.
Added in a63e13e.
@naotoj, @xuemingshen-oracle, do you have any thoughts on the compatibility impact?
[...] system property to restore long standing behavior.
Added in a63e13e.
@naotoj, @xuemingshen-oracle, do you have any thoughts on the compatibility impact?
Would it be safer to make the new behavior as opt-in, as well as making the system property to be windows specific windowsCompatibleEquals or something? Making it as the default might be a bit risky to me.
Also, IIUC the test is comparing cases between 'I', '\u0130' and '\u0131'. Probably we need to cross test with 'i' too.
Would it be safer to make the new behavior as opt-in, as well as making the system property to be windows specific
windowsCompatibleEqualsor something? Making it as the default might be a bit risky to me.
Opt-in might mean that no one will ever do it, but it would make the compatibility risk zero. The suggested property name is good, or it could be jdk.io.File.windowsNIOCompatibleEquals.
Also, IIUC the test is comparing cases between 'I', '\u0130' and '\u0131'. Probably we need to cross test with 'i' too.
Acknowledged.
@naotoj Thanks for the comments.
Would it be safer to make the new behavior as opt-in,
Commit 53caf9e changes it to opt-in.
as well as making the system property to be windows specific
Also done in 53caf9e.
@naotoj The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 2 Reviewers).
LGTM.
Thanks.
Probably someone else might want to review this too
Yeah, good idea.
If consensus is reached on this PR, I will proceed to the CSR and then the RN.