Draft: Support Win32 long path
This PR fixes #1060.
- Add new function
XSUM_widenStringAsUncPath()incli/xsum_os_specific.c.-
XSUM_fopen()andXSUM_stat()use this new function. - It always returns absolute UNC path with
\\?\prefix.- UTF-8 path is converted to
wchar_tpath. - Relative path is converted to absolute path with
\\?\prefix.
- UTF-8 path is converted to
-
\\?\prefix automatically avoids classic DOS device quirks (e.g.COM1,NUL, etc)- I think the
\\?\prefix is supported from Windows 2000. But I can't find any source from Microsoft :(
- I think the
-
- Add new test scripts in
tests/windows/-
00-test-all.batinvokes all tests. -
tests/windows/build-with-cmake.batbuildsxxhsumwithcmake -
tests/windows/test-long-path.battests issue #1060. -
!__!tracks line number.:ERRORroutine uses it to indicate error with line number.
-
For some reason, GH actions for MinGW fails at the following line of the Makefile
https://github.com/Cyan4973/xxHash/blob/fb93aa4c414ed40de580c7140e9dad16a56749b4/Makefile#L207
Error log
OK. (passes 49948 tests)
5ffb01494ce73724 stdin
Error: unable to open input
Error: Could not open 'xxhash.c': No such file or directory.
Error: unable to open input
Error: Could not open 'xxhash.h': No such file or directory.
mingw32-make: *** [Makefile:207: check] Error 1
Error: Process completed with exit code 2.
I can't reproduce above error in my local msys environment. But here is a different issue from my local environment:
# CFLAGS=-Werror make -C tests/collisions check
Testing xxh3 algorithm (64-bit)
This program will allocate a lot of memory,
generate 3200000 64-bit hashes from samples of 256 bytes,
and attempt to produce 0 collisions.
Creating filter (0 GB)
Generate 3200000 hashes from samples of 256 bytes
Generation and filter completed in 0s , detected up to 240270 candidates
Storing hash candidates (1 MB)
assertion "checkPtr == hashCandidates" failed: file "main.c", line 772, function: search_collisions
Is this line fine?
https://github.com/Cyan4973/xxHash/blob/7aee8d0a341bb574f7c139c769e1db115b42cc3c/tests/collisions/main.c#L772
Allocation size of hashCandidates is tableSize.
size_t const tableSize = (size_t)((maxNbH+1) * hashByteSize);
...
void* const hashCandidates = malloc(tableSize);
But size for realloc() is nbCandidates * hashByteSize.
if (nbCandidates < maxNbH) {
...
void* const checkPtr = realloc(hashCandidates, nbCandidates * hashByteSize);
...
assert(checkPtr == hashCandidates); /* simplification: since we are reducing the size, ... */
}
Since nbCandidates < maxNbH, argument of relloc() is always less than its original allocation size.
It means realloc() may or may not reallocate (move) its memory region (or not).
Therefore, after this realloc(), we should not use hashCandidates.
Also, when I put printf("%p\n", hashCandidates) for debug, gcc-15 reports the following warning/error.
error: pointer 'hashCandidates' may be used after 'realloc' [-Werror=use-after-free]
This is great but also I think that Windows requires that there specifically be a manifest in the exe file. Looking into it, I think that we can simply make a file called xxhsum.manifest alongside the xxhsum.c and then cmake will automatically pick that up and link it in: https://cmake.org/cmake/help/v3.4/release/3.4.html#other
Suggested xxhsum.manifest:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0" xmlns:asmv3="urn:schemas-microsoft-com:asm.v3">
<asmv3:application>
<windowsSettings xmlns:ws2="http://schemas.microsoft.com/SMI/2016/WindowsSettings">
<ws2:longPathAware>true</ws2:longPathAware>
</windowsSettings>
</asmv3:application>
<trustInfo xmlns="urn:schemas-microsoft-com:asm.v3">
<security>
<requestedPrivileges>
<requestedExecutionLevel level="asInvoker"/>
</requestedPrivileges>
</security>
</trustInfo>
<compatibility xmlns="urn:schemas-microsoft-com:compatibility.v1">
<application>
<!--The ID below indicates application support for Windows Vista -->
<supportedOS Id="{e2011457-1546-43c5-a5fe-008deee3d3f0}"/>
<!--The ID below indicates application support for Windows 7 -->
<supportedOS Id="{35138b9a-5d96-4fbd-8e2d-a2440225f93a}"/>
<!--The ID below indicates application support for Windows 8 -->
<supportedOS Id="{4a2f28e3-53b9-4441-ba9c-d69d4a4a6e38}"/>
<!--The ID below indicates application support for Windows 8.1 -->
<supportedOS Id="{1f676c76-80e1-4239-95bb-83d0f6d0da78}"/>
<!--The ID below indicates application support for Windows 10 -->
<supportedOS Id="{8e0f7a12-bfb3-4fe8-b9a5-48fd50a15a9a}"/>
</application>
</compatibility>
</assembly>
@29039 While a manifest file is convenient, my goal is to provide broader compatibility.
For example, I’d like to support “bad” Win32 filenames such as my_file. (filename with a trailing period). Since Linux/WSL (and NTFS) accepts this as a valid filename, broader compatibility could benefit other users as well. (My current implementation is incorrect, so it doesn’t work yet.)
Please also note that this patch works correctly even without a manifest file.
WIP. But now this PR supports long-path, UNC path and trailing period. e.g. my_trailing_period_file.
tests/windows/00-test-all.bat builds and tests all.
TODO:
- [ ] Ask Cyan: We can allow to use
pathcch.libto simplifiy the code. Though it's only supported Windows8+ or Windows Server 2012+. - [X] Fix CMakefile.txt for mingw which doesn't support
#pragma comment(lib)
Added cmake option XXHASH_WIN_TARGET_WIN10 for MinGW / cross-compile environment.
- cmake + MSVC user doesn't have to care about this option.
- MinGW user who need long-path support, they should pass
XXHASH_WIN_TARGET_WIN10to cmake.
When cmake uses MSVC, it automatically host environment is Windows 10+ or not.
But in the MinGW/MSYS environment, I don't know how to detect it.
Therefore, I added expcilit option for it.
This switch enables long-path code via preprocessor macro XXHSUM_WIN32_LONGPATH.
PATHCCH_DO_NOT_NORMALIZE_SEGMENTS (support trailing period and space) and PATHCCH_ENSURE_IS_EXTENDED_LENGTH_PATH (ensure \\?\ prefix) are implemented since Windows 10 RS2 1703 as operation system DLL and its SDK. Unfortunately, earlier version of Windows just ignore these flags.
Therefore, its result depends on Win10 version. (xxhsum my_file. works fine with Win10RS2>, but earlier versions of Win10 reports error "file not found")
So we need to check runtime Windows version for it. But it may be over-engineering for this small utility. Possible solutions are:
- Re-imprement them by hand. Actually, reactos have them. It also provides wider compatibility as a bonus. (It works with all NTFS Windows).
- Detect runtime Windows version and use proper flag and logic.
- Fallback to manifest as @29039 suggested.
- Just implement it and say "It's Microsoft spec, not our fault 😏 ".
Sorry, just noticed this question:
Ask Cyan: We can allow to use pathcch.lib to simplifiy the code. Though it's only supported Windows8+ or Windows Server 2012+.
I think that's reasonable
@Cyan4973 , thanks. I forgot to add @ to the question 😅
This PR is still tagged as "Draft". But at a quick glance, it already looks pretty clean.
So at which stage is it ? Still some work to do ? Or ready for review and merge ?
cc @t-mat