
isValid "\\\\?\\UNC\\"

Open Bodigrim opened this issue 2 years ago • 7 comments

> System.FilePath.Windows.isValid "\\\\?\\UNC\\"
True
> putStrLn "\\\\?\\UNC\\"
\\?\UNC\

I think this is wrong: \\?\UNC\ is incomplete; it is neither a file nor a folder name.

https://github.com/haskell/filepath/blob/98f8bba9eac8c7183143d290d319be7df76c258b/System/FilePath/Internal.hs#L1065-L1067

If we are in agreement that isValid should return False on this input, there is a harder question ahead. What should be the output of makeValid? Something like \\?\UNC\_\_?
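For illustration, here is a minimal sketch of the kind of check that would flag this input. The helpers `stripExtendedUnc` and `isBareExtendedUnc` are hypothetical names, not part of filepath, and the sketch assumes backslash-only separators and that "no server name after the prefix" is the right notion of "incomplete".

```haskell
import Data.Char (toUpper)
import Data.List (stripPrefix)

-- | Hypothetical helper (not in filepath): strip a leading \\?\UNC\ prefix,
-- matching "UNC" case-insensitively, and return the remainder.
stripExtendedUnc :: String -> Maybe String
stripExtendedUnc path = do
  rest <- stripPrefix "\\\\?\\" path
  let (tag, rest') = splitAt 4 rest
  if map toUpper tag == "UNC\\" then Just rest' else Nothing

-- | True when the path is \\?\UNC\ with no server name following it.
-- Assumption: only backslash is considered a separator here.
isBareExtendedUnc :: String -> Bool
isBareExtendedUnc path = case stripExtendedUnc path of
  Just rest -> null (takeWhile (/= '\\') rest)
  Nothing -> False
```

With something like this, `isValid` could return False whenever `isBareExtendedUnc` holds; what `makeValid` should do with such input remains the open question.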

Bodigrim avatar Mar 11 '23 01:03 Bodigrim

Related: https://github.com/haskell/filepath/issues/92

isValid is a hot mess on windows.

I'm not sure how much improvement we can drive here with ad-hoc bugfixes.

The underlying problem is that we're not parsing windows filepaths, although there are pieces that allow us to put together a proper grammar:

  • https://github.com/haskell/filepath/pull/99
  • https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-dtyp/62e862f4-2a51-452e-8eeb-dc4ff5ee33cc?redirectedfrom=MSDN

With that we could implement a more meaningful version of isValid.

hasufell avatar Mar 11 '23 02:03 hasufell

https://github.com/haskell/filepath/blob/16e5374620189b27eca1eed09642ec02b2222fc8/System/FilePath/Internal/Parser.hs#L26-L61

hasufell avatar Mar 11 '23 02:03 hasufell

I'm not sure how much improvement we can drive here with ad-hoc bugfixes.

I agree. My bigger concern is that while at least in theory isValid could be made correct, makeValid is fundamentally broken on Windows. It's not like you can meaningfully repair any Windows path at all. Even current behaviour makeValid "test*" == "test_" is a bit of WAAAAT? Maybe mark it as deprecated?..

Bodigrim avatar Mar 11 '23 10:03 Bodigrim

Ok, so things are a little more complicated on windows wrt "\\\\?\\UNC\\".

These are not statically assigned special names afaiu. Instead those are some form of object symlinks that are maintained inside of windows (and can be viewed in the WinObj browser tool). Also see: https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file#nt-namespaces

There are many more, e.g. look at:

\\?\UNC\localhost\c$\foo\bar                       -> \\localhost\c$\foo\bar
\\?\GLOBALROOT\GLOBAL??\UNC\localhost\c$\foo\bar   -> \\localhost\c$\foo\bar
\\?\HarddiskVolume2\foo\bar                        -> C:\foo\bar (if HarddiskVolume2 is C:)
\\?\GLOBALROOT\GLOBAL??\HarddiskVolume2\foo\bar    -> C:\foo\bar (if HarddiskVolume2 is C:)
\\?\GLOBALROOT\Device\Harddisk0\Partition2\foo\bar -> C:\foo\bar (if Harddisk0\Partition2 is C:)

(all of the above are roughly equivalent)

The fact that filepath as a library treats \\\\?\\UNC\\ special is in my opinion more of a wart than a feature. I don't consider \\\\?\\UNC\\ a special case in my grammar. The meaning of those object links can only fully be understood when performing IO. Some of them may be somewhat conventional, but still...

Maybe @Mistuke has another opinion.

hasufell avatar Mar 11 '23 11:03 hasufell

AFAIU https://learn.microsoft.com/en-us/dotnet/standard/io/file-path-formats#dos-device-paths, \\?\UNC\ is a special case. Namely, Windows filenames can be:

  • Traditional DOS paths, C:\foo\bar
  • UNC paths, which is a confusing name, because they are commonly known as "shared drive paths", \\server\share\file.
  • DOS device paths, which are an attempt at universal resource identification going beyond the file system, roughly approaching the Unix model. These start with \\.\ (or \\?\), followed by a resource name.
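
The three categories above can be told apart by prefix alone. A rough sketch, assuming backslash-only separators; `PathKind` and `classify` are made-up names for illustration and not filepath API:

```haskell
import Data.Char (isAsciiLower, isAsciiUpper)
import Data.List (isPrefixOf)

-- | Illustrative classification of the three categories listed above.
data PathKind = DosPath | UncPath | DevicePath | OtherPath
  deriving (Eq, Show)

-- | Classify by prefix only. Device paths must be tested before UNC paths,
-- because both start with two backslashes.
classify :: String -> PathKind
classify p
  | "\\\\.\\" `isPrefixOf` p || "\\\\?\\" `isPrefixOf` p = DevicePath
  | "\\\\" `isPrefixOf` p = UncPath
  | (d : ':' : _) <- p, isAsciiUpper d || isAsciiLower d = DosPath
  | otherwise = OtherPath
```

Note that under this view \\?\UNC\server\share is just a `DevicePath`; nothing here inspects the component after the prefix.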

Now there is a bit of confusion. If you want to format a traditional DOS path as a device path, you can just prepend \\.\ to C:\foo\bar, obtaining \\.\C:\foo\bar. The same does not work for UNC paths to shared drives, because you end up with \\.\\server\share\file, and device paths are not supposed to contain \\ anywhere except at the beginning. To overcome this restriction Windows introduces a workaround: instead of \\.\\server\share\file you are supposed to write \\.\UNC\server\share\file. So this is a special syntax.
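
The conversion just described can be sketched as follows. `toDevicePath` is a hypothetical name, and the sketch assumes its input is an absolute DOS path, a UNC path, or already a device path; relative paths are not handled.

```haskell
import Data.List (isPrefixOf, stripPrefix)

-- | Hypothetical helper: rewrite a DOS or UNC path as a \\.\ device path.
toDevicePath :: String -> String
toDevicePath p
  -- Already a device path: leave it alone.
  | "\\\\.\\" `isPrefixOf` p || "\\\\?\\" `isPrefixOf` p = p
  -- UNC path: drop the leading \\ and route through the UNC name.
  | Just rest <- stripPrefix "\\\\" p = "\\\\.\\UNC\\" ++ rest
  -- Assumed to be an absolute DOS path such as C:\foo\bar.
  | otherwise = "\\\\.\\" ++ p
```

For example, `toDevicePath "\\\\server\\share\\file"` gives the Haskell string for \\.\UNC\server\share\file, and `toDevicePath "C:\\foo\\bar"` gives \\.\C:\foo\bar.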

Bodigrim avatar Mar 11 '23 12:03 Bodigrim

So this is a special syntax.

It's not syntax, those are simply symbolic links. Again, there's also \\?\GLOBALROOT\GLOBAL??\UNC ...why don't we support that form? We can even do \\?\GLOBALROOT\Device\Mup\localhost\c$\foo\bar.


hasufell avatar Mar 11 '23 12:03 hasufell

The fact that filepath as a library treats \\?\UNC\ special is in my opinion more of a wart than a feature. I don't consider \\?\UNC\ a special case in my grammar. The meaning of those object links can only fully be understood when performing IO. Some of them may be somewhat conventional, but still...

FWIW I agree. Inside GHC's handling we only really treat \\?\ and \\.\ as special.
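
To make that concrete, here is a rough sketch of a split that recognises only those two prefixes and treats everything after them, UNC included, as ordinary components. This is not GHC's or filepath's actual code; `Namespace`, `splitNamespace` and `components` are illustrative names, and only backslash is assumed as a separator.

```haskell
import Data.List (stripPrefix)

-- | The two prefixes treated as special: \\?\ and \\.\ respectively.
data Namespace = Win32File | Win32Device
  deriving (Eq, Show)

-- | Split off a namespace prefix, if present, and break the rest into
-- components. "UNC" gets no special treatment here.
splitNamespace :: String -> Maybe (Namespace, [String])
splitNamespace p
  | Just rest <- stripPrefix "\\\\?\\" p = Just (Win32File, components rest)
  | Just rest <- stripPrefix "\\\\.\\" p = Just (Win32Device, components rest)
  | otherwise = Nothing

-- | Split on backslashes, dropping empty components.
components :: String -> [String]
components s = case break (== '\\') s of
  (c, []) -> [c | not (null c)]
  (c, _ : rest) -> [c | not (null c)] ++ components rest
```

Under this reading, \\?\UNC\ parses as the \\?\ namespace followed by the single component UNC, so whether it names anything is a question for IO rather than for the grammar.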

Mistuke avatar Mar 12 '23 19:03 Mistuke