`System.FilePath.Posix` treats filenames with leading `.`s as extensions
Right now System.FilePath.Posix.splitExtension treats a leading . in the final path component as the start of an extension. This leads to some odd behaviour:
```haskell
import System.FilePath.Posix

splitExtension "myrepo/.git"
-- returns ("myrepo/", ".git")

splitExtension "myrepo/.."
-- returns ("myrepo/.", ".")
```
I'd expect splitExtension to treat a file whose leading character is a . as a filename with no extension. The current behaviour appears to be consistent across all functions that work on paths (takeBaseName, etc). Changing these would be a pretty big breaking change, but at the very least this should probably be mentioned in the docs.
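For reference, here is a small sketch of how a few of the related functions handle the same dotfile path. The name `dotfileResults` is only an illustrative helper, not part of filepath; the values in the comments follow from each function being defined in terms of the same extension split.

```haskell
import System.FilePath.Posix

-- Illustrative helper (not part of filepath): pairs each function name
-- with its result on a dotfile path.
dotfileResults :: [(String, String)]
dotfileResults =
  [ ("takeExtension", takeExtension path)        -- ".git"
  , ("takeBaseName",  takeBaseName path)         -- ""
  , ("dropExtension", dropExtension path)        -- "myrepo/"
  , ("hasExtension",  show (hasExtension path))  -- "True"
  ]
  where
    path = "myrepo/.git"
```

Evaluating `mapM_ print dotfileResults` in GHCi lists the results, so the empty base name is easy to spot.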
So the problem is that "file extension" is not really defined by POSIX, so there's no specification of what constitutes a file extension.
.git could mean the filename is empty and the extension is, well, .git.
Yeah, it is unfortunate that POSIX leaves so much room for interpretation.
That being said, I did a quick survey of other filepath libraries, and every one I checked behaves the way I described, treating a leading . as part of the filename rather than as an extension:
- Rust: https://doc.rust-lang.org/std/path/struct.Path.html#method.extension
- OCaml: https://ocaml.org/manual/5.4/api/Filename.html
- C++: https://en.cppreference.com/w/cpp/filesystem/path/extension
That looks like a reasonable behaviour to follow, especially if POSIX lets us make the decision.
Well, how do we assess the breaking changes?
Behavior changes in filepath can easily lead to security vulnerabilities.
I haven't yet found an approach for making more radical changes, and I want to! E.g. around UNC parsing.
One reasonable path forward would be to identify all of the problematic functions, add versions of them that treat files with a leading . as filenames instead of extensions, and then mark the old ones as deprecated. After a decently long release cycle, we could do a major-version bump that then removes the problematic functions.
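As a rough sketch of what one such replacement could look like: the name `splitExtensionDotfile` is hypothetical, and the handling of edge cases (several leading dots, for instance) is just one possible choice, not a settled design.

```haskell
import System.FilePath.Posix (splitExtension, splitFileName)

-- | Hypothetical variant of 'splitExtension' (not part of filepath).
-- If the final path component has no '.' after its leading dots
-- (e.g. ".git", "." or ".."), the whole path is returned with an
-- empty extension. Otherwise it defers to 'splitExtension'.
splitExtensionDotfile :: FilePath -> (FilePath, String)
splitExtensionDotfile path
  | '.' `notElem` rest = (path, "")
  | otherwise          = splitExtension path
  where
    (_, file) = splitFileName path
    -- Drop the leading dots so they can never start an extension.
    rest      = dropWhile (== '.') file
```

With this sketch, `splitExtensionDotfile "myrepo/.git"` gives `("myrepo/.git", "")`, while `splitExtensionDotfile "myrepo/.config.yaml"` still gives `("myrepo/.config", ".yaml")`.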
We could also just add those functions and keep the old ones around with a big caveat in the docs, but this feels like leaving a rake lying around in a yard for someone unsuspecting to step on.
I think ultimately, the only solution is to maintain current behavior indefinitely and add a new System.FilePath.V2 or something. But we'd need to aggregate all the changes, because we don't want a plethora of variants.