
`System.FilePath.Posix` treats filenames with leading `.`s as extensions

Open TOTBWF opened this issue 2 months ago • 6 comments

Right now System.FilePath.Posix.splitExtension will treat any leading . in a path as an extension. This leads to some odd behaviour:

import System.FilePath.Posix

splitExtension "myrepo/.git"
-- returns ("myrepo/", ".git")

splitExtension "myrepo/.."
-- returns ("myrepo/.",".")

I'd expect splitExtension to treat a file whose name starts with a . as having a filename and no extension. The current behaviour appears to be consistent across all functions that work on paths (takeBaseName, etc.). Changing these would be a pretty big breaking change, but at the very least this should probably be mentioned in the docs.
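To illustrate how the same treatment shows up in the related functions, here is a small sketch that applies a few of them to the same path. The name `dotfileResults` is only for illustration; the functions themselves are the existing `System.FilePath.Posix` exports, and the commented results follow from `splitExtension` returning `("myrepo/", ".git")` as shown above.

```haskell
import System.FilePath.Posix
  (dropExtension, takeBaseName, takeExtension, takeFileName)

-- Results of a few extension-related functions on a dotfile path.
-- `dotfileResults` is a hypothetical name used only for this sketch.
dotfileResults :: [(String, String)]
dotfileResults =
  [ ("takeFileName",  takeFileName p)   -- ".git"
  , ("takeExtension", takeExtension p)  -- ".git": the whole name counts as extension
  , ("takeBaseName",  takeBaseName p)   -- "": nothing is left as the base name
  , ("dropExtension", dropExtension p)  -- "myrepo/": the file name disappears
  ]
  where
    p = "myrepo/.git"
```

Loading this in GHCi and evaluating `dotfileResults` should be enough to see the effect without writing a `main`.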

TOTBWF avatar Oct 30 '25 13:10 TOTBWF

So the problem is that "file extension" is not really defined by POSIX, so there is no specification of what constitutes a file extension.

.git could mean the filename is empty and the extension is, well, .git.

hasufell avatar Oct 30 '25 14:10 hasufell

Yeah, it is unfortunate that POSIX leaves so much room for interpretation.

That being said, I did a quick survey of other filepath libraries, and every one I checked has the behavior I described:

  • Rust: https://doc.rust-lang.org/std/path/struct.Path.html#method.extension
  • OCaml: https://ocaml.org/manual/5.4/api/Filename.html
  • C++: https://en.cppreference.com/w/cpp/filesystem/path/extension

TOTBWF avatar Oct 30 '25 14:10 TOTBWF

That looks like a reasonable behaviour to follow, especially if POSIX lets us make the decision.

Kleidukos avatar Oct 30 '25 14:10 Kleidukos

Well, how do we assess the breaking changes?

Behavior change in filepath can easily lead to security vulnerabilities.

I haven't yet found an approach for making more radical changes, and I'd like to, e.g. around UNC parsing.

hasufell avatar Oct 30 '25 15:10 hasufell

One reasonable path forward would be to identify all of the problematic functions, add versions of them that treat files with a leading . as filenames instead of extensions, and then mark the old ones as deprecated. After a decently long release cycle, we could do a major-version bump that then removes the problematic functions.
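As a rough sketch of what one such added function could look like: the name `splitExtensionDotfile` is hypothetical, and the edge-case choices (a name made only of dots has no extension, and only a single leading dot is protected) are assumptions for illustration, not a settled design. It builds on the existing `splitExtension` and only handles `/` as a separator.

```haskell
import System.FilePath.Posix (splitExtension)

-- Hypothetical variant: a leading '.' in the file name is treated as part
-- of the name rather than as the start of an extension.
splitExtensionDotfile :: FilePath -> (FilePath, String)
splitExtensionDotfile path
  -- Empty file names, ".", ".." and similar have no extension (assumption).
  | all (== '.') file = (path, "")
  | otherwise =
      case file of
        -- Set the leading dot aside, split the rest as usual, re-attach it.
        '.' : rest ->
          let (base, ext) = splitExtension rest
          in (dir ++ "." ++ base, ext)
        -- No leading dot: defer to the existing behaviour.
        _ -> splitExtension path
  where
    -- Split at the last '/', keeping the separator on the directory part.
    (revFile, revDir) = break (== '/') (reverse path)
    file = reverse revFile
    dir  = reverse revDir

-- splitExtensionDotfile "myrepo/.git"     == ("myrepo/.git", "")
-- splitExtensionDotfile "myrepo/.."       == ("myrepo/..", "")
-- splitExtensionDotfile "myrepo/file.txt" == ("myrepo/file", ".txt")
-- splitExtensionDotfile ".bashrc.bak"     == (".bashrc", ".bak")
```

Defining it in terms of the old function keeps the two in agreement for every path whose file name does not start with a dot, which might make the deprecation period easier to reason about.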

We could also just add those functions and keep the old ones around with a big caveat in the docs, but this feels like leaving a rake lying around in a yard for someone unsuspecting to step on.

TOTBWF avatar Oct 30 '25 15:10 TOTBWF

I think ultimately, the only solution is to maintain current behavior indefinitely and add a new System.FilePath.V2 or something. But we'd need to aggregate all the changes, because we don't want a plethora of variants.

hasufell avatar Oct 31 '25 03:10 hasufell