trycmd
trycmd copied to clipboard
Handle extended path length syntax in automatic [CWD] detection
Windows has the extended path length syntax to work around legacy file path restrictions. These paths start with the string \\?\ followed by a normal path; more details can be found here. The problem is that Rust's std::fs::canonicalize creates these kinds of paths on Windows (which makes sense given that paths over 260 characters are otherwise disallowed), but trycmd can't handle them. If my binary under test's output, on Windows, includes something like:
\\?\C:\Path\To\Working\Directory\Some\File.txt
trycmd will transform this into:
//?/[CWD]/Some/File.txt
However, on Posix-like systems, my binary's output looks like this:
/Path/To/Working/Directory/Some/File.txt
as the behaviour of fs::canonicalize is different (on all systems, canonicalization is handled by the operating system, in the case of Posix, via realpath). Therefore, trycmd "compresses" that into
[CWD]/Some/File.txt
Depending on whether the test was blessed on Windows or a Posix-like system, the tests will therefore fail on the other family of systems.
A workaround is of course to discard extended length path syntax in my binary before printing the path or file name, but as stated above that can lead to other problems. I'd like trycmd to be able to handle this; i.e. collapse the leading extended path syntax specifier into [CWD] if it is found before the normal working directory path.
How come you are showing to users the unc syntax? Normally I would expect users to see the more common output.
As for resolving this, the challenge is dealing with both kinds of path syntaxes I think I'd like to keep the last-write-wins for insert_var so things can be overriden.
We could possibly add [UNC_ROOT] and [UNC_CWD]. These will run into the same annoyances as [EXE] where running on Linux gets one behavior ([ROOT]) but running on windows will need something else ([UNC_ROOT]) and annoys developers/contributors.
How come you are showing to users the unc syntax? Normally I would expect users to see the more common output.
I'm not sure what is wrong about extended path syntax or UNC paths; some paths require it (e.g. network shares, WSL directories) and my users are developers. Also, as said, non-extended paths have some limitations and nobody knows which programs will ignore those limitations nowadays and which don't.
We could possibly add [UNC_ROOT] and [UNC_CWD]. These will run into the same annoyances as [EXE] where running on Linux gets one behavior ([ROOT]) but running on windows will need something else ([UNC_ROOT]) and annoys developers/contributors.
Even though this fix is plausible, I'd like to know what the problem is with merging that into [CWD]. After all, absolute paths to the current working directory start with a / on Unix systems and never with a \. And on Windows, non-extended paths start with A-Z, so I'm confident that all three cases can be distinguished just by considering the first couple of characters.
I'm not sure what is wrong about extended path syntax or UNC paths; some paths require it (e.g. network shares, WSL directories) and my users are developers. Also, as said, non-extended paths have some limitations and nobody knows which programs will ignore those limitations nowadays and which don't.
Yes, there are cases where UNC paths are the only representation (ignoring length). When UNC is being used for length or "just because", that is not a representation that the user expects or is used to and its better to keep the path in a user friendly way.
Even though this fix is plausible, I'd like to know what the problem is with merging that into [CWD]. After all, absolute paths to the current working directory start with a / on Unix systems and never with a . And on Windows, non-extended paths start with A-Z, so I'm confident that all three cases can be distinguished just by considering the first couple of characters.
This assumes we are doing a hard coded parsing of the content. Instead, we are doing a straight variable substitution.
This assumes we are doing a hard coded parsing of the content. Instead, we are doing a straight variable substitution.
Thanks for clearing that up, I'm not familiar with the inner workings of the library. Then I'm okay with this fix.
If we do separate variables, an interesting aspect is what we should match. I feel like I'd prefer we gave precedence to CWD and ROOT over the UNC versions but, as you saw, the UNC version includes the non-UNC version which makes that more difficult.