FilePathsBase.jl
Idea: RelativePath as own type, independent from other details
Conceptually, under this proposal we separate the details of a protocol-specific path, which belong to an absolute path (e.g. PosixPath, S3Path, FTPPath, etc.), from the simpler relative path, which can be more protocol-agnostic.
AbsolutePaths would hold the protocol, have roots, hold config, etc.
A RelativePath would be agnostic to the details of any given type, and would only have relevant constraints (if any) imposed on it when joinpath was done.
RelativePaths could be used across protocols. For example, one could copy a file with its folder structure by splitting out the relative path of a local file and then joining it to the remote location path.
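A minimal sketch of that cross-protocol copy, using toy types rather than anything from FilePathsBase. The names RelPath, AbsPath, relative_to, join_rel and render are all hypothetical and only illustrate the idea:

```julia
# Toy types for illustration only; none of these names are FilePathsBase API.
struct RelPath
    segments::Vector{String}
end

struct AbsPath
    protocol::String          # e.g. "file" or "s3"
    root::String              # e.g. "/" or "s3://bucket/"
    segments::Vector{String}
end

# Split off the part of `p` that lies below `base` as a protocol-agnostic RelPath.
function relative_to(p::AbsPath, base::AbsPath)
    n = length(base.segments)
    p.protocol == base.protocol && p.root == base.root &&
        length(p.segments) >= n && p.segments[1:n] == base.segments ||
        throw(ArgumentError("path is not located under base"))
    return RelPath(p.segments[n+1:end])
end

# Joining keeps the protocol and root of the absolute side.
join_rel(a::AbsPath, r::RelPath) = AbsPath(a.protocol, a.root, vcat(a.segments, r.segments))

render(p::AbsPath) = p.root * join(p.segments, "/")

local_file  = AbsPath("file", "/", ["home", "me", "data", "2020", "a.csv"])
local_base  = AbsPath("file", "/", ["home", "me", "data"])
remote_base = AbsPath("s3", "s3://bucket/", ["backup"])

rel = relative_to(local_file, local_base)     # holds only ["2020", "a.csv"]
remote_file = join_rel(remote_base, rel)
println(render(remote_file))
```

The relative part carries no protocol or root, so the same value can be joined onto either the local or the remote base.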
Logically, one should probably only be able to open an AbsolutePath.
One option is for some or all operations to take a RelativePath and do joinpath(cwd(), relpath).
Always doing this would probably mean RelativePath <: SystemPath.
One advantage is that we could automatically know that S3Path("bucket") means "s3://bucket/", since there is no such thing as a relative S3Path.
Further, we would get to forbid, at dispatch time, calls to joinpath(::Any, ::AbsolutePath), and wouldn't have to worry about how to handle nonsense like joinpath(p"s3://foo/bar/", p"s3://foo/zap/").
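As a rough illustration of forbidding such calls through dispatch, here is a sketch with toy types. ToyPath, ToyAbsolute, ToyRelative and toyjoin are hypothetical names, not FilePathsBase API. Since Julia is dynamic, the rejection shows up as a MethodError when the call is made:

```julia
# Toy types for illustration only.
abstract type ToyPath end

struct ToyAbsolute <: ToyPath
    root::String
    segments::Vector{String}
end

struct ToyRelative <: ToyPath
    segments::Vector{String}
end

# Only a relative path may appear on the right-hand side of a join.
toyjoin(a::ToyAbsolute, r::ToyRelative) = ToyAbsolute(a.root, vcat(a.segments, r.segments))
toyjoin(a::ToyRelative, r::ToyRelative) = ToyRelative(vcat(a.segments, r.segments))

bar = ToyAbsolute("s3://foo/", ["bar"])
zap = ToyAbsolute("s3://foo/", ["zap"])
rel = ToyRelative(["x", "y.txt"])

joined = toyjoin(bar, rel)

# No method accepts an absolute path as the second argument,
# so joining two absolute paths fails with a MethodError.
rejected = try
    toyjoin(bar, zap)
    false
catch err
    err isa MethodError
end
```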
In the Haskell path library, the authors separate absolute and relative paths, and it helped a lot in terms of type safety. Julia is not a language where that kind of type safety really works out, since it is not ahead-of-time compiled. But I do think it might bring a lot of clarity to the code.
I think not having this distinction is what made https://github.com/rofinn/FilePathsBase.jl/pull/63 and https://github.com/JuliaCloud/AWSS3.jl/pull/74/ feel odd.
This is related to https://github.com/rofinn/FilePathsBase.jl/issues/71 but is a bit less of a wild idea. It is probably mutually exclusive with that idea.
> don't have to worry about how to handle nonsense like joinpath(p"s3://foo/bar/", p"s3://foo/zap/")
nonsense: https://github.com/JuliaCloud/AWSS3.jl/issues/75
> In the Haskell path library, the authors separate absolute and relative paths, and it helped a lot in terms of type safety.
https://hackage.haskell.org/package/path
I'll note that a downside of this approach is that it pushes a lot more logic into the path construction (e.g., aggressive path normalization, system calls to check if something is a path). If we were going to do this I'm guessing we'd want to encode it sort of like Distributions.jl?
# Named `Kind` rather than `Type` to avoid clashing with Julia's built-in `Type`
abstract type Kind end
struct Directory <: Kind end
struct File <: Kind end
abstract type Form end
struct Absolute <: Form end
struct Relative <: Form end
# Every path type carries its kind (file/directory) and form (absolute/relative)
abstract type AbstractPath{K<:Kind, F<:Form} end
abstract type SystemPath{K<:Kind, F<:Form} <: AbstractPath{K, F} end
struct PosixPath{K<:Kind, F<:Form} <: SystemPath{K, F}
    ...
end
# Aliases that fix the kind and leave the form free
const DirectoryPath = AbstractPath{Directory}
const FilePath = AbstractPath{File}
# If we say that relative paths can only make sense for system paths
const RelativePath{K} = SystemPath{K, Relative}
const RelativeDirectory = SystemPath{Directory, Relative}
const RelativeFile = SystemPath{File, Relative}
# Absolute paths exist for every path type, so these alias AbstractPath
const AbsolutePath{K} = AbstractPath{K, Absolute}
const AbsoluteDirectory = AbstractPath{Directory, Absolute}
const AbsoluteFile = AbstractPath{File, Absolute}
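To give a feel for how such parameters might be used in dispatch, here is a small self-contained sketch. It uses its own toy names (PKind, PForm, ParamPath, pjoin, describe) so it does not depend on the definitions above, and the kind and form accessors are assumptions, not existing FilePathsBase functions:

```julia
# Toy hierarchy for illustration only.
abstract type PKind end
struct PDir <: PKind end
struct PFile <: PKind end

abstract type PForm end
struct PAbs <: PForm end
struct PRel <: PForm end

struct ParamPath{K<:PKind, F<:PForm}
    segments::Vector{String}
end

# Accessors that pull the type parameters back out of a path value.
kind(::ParamPath{K, F}) where {K, F} = K
form(::ParamPath{K, F}) where {K, F} = F

# Joining is only defined for (directory, relative path); the result keeps
# the form of the left side and the kind of the right side.
function pjoin(a::ParamPath{PDir, F}, b::ParamPath{K, PRel}) where {F<:PForm, K<:PKind}
    return ParamPath{K, F}(vcat(a.segments, b.segments))
end

# Dispatching on the kind alone, leaving the form free.
describe(::ParamPath{PDir}) = "directory"
describe(::ParamPath{PFile}) = "file"

home   = ParamPath{PDir, PAbs}(["home", "me"])
note   = ParamPath{PFile, PRel}(["notes", "a.txt"])
joined = pjoin(home, note)
```

Here joined comes out as an absolute file path purely from the parameters of its inputs, without any filesystem call.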
Hmm
In Python's pathlib, a path is absolute if it has a root, and (if applicable) a drive.
So I suggest that perhaps an absolute path is a composite type containing both a root and a relative path.
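A minimal sketch of that composite layout, with hypothetical names (Anchor, RelPart, AbsComposite, reroot) chosen only for illustration:

```julia
# Toy types for illustration only.
struct RelPart
    segments::Vector{String}
end

struct Anchor
    drive::String   # "" where drives do not apply, e.g. "C:" on Windows
    root::String    # e.g. "/" or "\\"
end

# An absolute path is a root (plus drive, if applicable) and a relative part.
struct AbsComposite
    anchor::Anchor
    rel::RelPart
end

relpart(p::AbsComposite) = p.rel

# Re-anchoring keeps the relative part untouched.
reroot(p::AbsComposite, a::Anchor) = AbsComposite(a, p.rel)

posix_file = AbsComposite(Anchor("", "/"), RelPart(["srv", "data", "a.csv"]))
win_file   = reroot(posix_file, Anchor("C:", "\\"))
```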
That's kind of already the case, in that a relative path is the same type but just has an empty root. You're suggesting having fully independent structs?
One thing I'm not sure about is generic parsing of relative paths. Currently, we parse Windows and POSIX paths differently even when they're relative (e.g., the separator is different, a drive can be present). If we had one RelativePath that basically just wrapped the segments tuple then that wouldn't be possible. I feel like parameterizing the existing type hierarchy would make it easier for new base path types to decide how relative paths should be parsed and displayed. We could still have a general fallback though.
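To make the parsing concern concrete, here is a deliberately simplified sketch of flavor-specific parsing. Flavor, Posix, Windows and parse_relative are hypothetical names, and the rules are much cruder than what FilePathsBase actually implements:

```julia
# Toy flavors for illustration only.
abstract type Flavor end
struct Posix <: Flavor end
struct Windows <: Flavor end

# POSIX: "/" is the only separator and there is no drive.
parse_relative(::Posix, s::AbstractString) = ("", split(s, '/'; keepempty=false))

# Windows: both "\\" and "/" separate segments, and a drive prefix such as "C:"
# may be present even on a relative path.
function parse_relative(::Windows, s::AbstractString)
    m = match(r"^([A-Za-z]:)?(.*)$", s)
    drive = m.captures[1] === nothing ? "" : String(m.captures[1])
    return (drive, split(m.captures[2], ['\\', '/']; keepempty=false))
end

input = "C:foo\\bar/baz"
posix_drive, posix_segments = parse_relative(Posix(), input)
win_drive, win_segments     = parse_relative(Windows(), input)
```

The same string yields two segments and no drive under the POSIX rules, but a drive and three segments under the Windows rules, which is why a single flavor-agnostic RelativePath parser is hard to define.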
Okay, after playing with a prototype for a few days I don't think the benefits outweigh the extra complexity. Here is my reasoning:
- As mentioned above, a single RelativePath type wouldn't even cover the 2 system path types we currently have nicely, and would require a WindowsRelativePath and a PosixRelativePath, which also couldn't be both a SystemPath and an AbstractRelativePath. This would become even more difficult if we want to differentiate files and directories as demonstrated in the Haskell library linked.
- I think the parameterized types solution provides more of the flexibility we'd want in terms of describing the intersection of path types, directories/files and relative/absolute paths. The problem is that the solution is rather hacky based on how Julia works. I've provided some examples below:
  - Extracting and reconstructing parameterized types is awkward (fptype(src){form(src), kind(dst)} to construct a new path using the properties of the src and dst paths). The nice part is that it's very explicit about what it returns, I guess?
  - Doesn't seem to provide many advantages over calling isrelative/isabsolute or isdir/isfile.
  - Parsing system paths needs to be very strict about directories including a trailing separator. This means that the Dir parameter only tells you that isdirpath is true, while isdir could still be false... and produce some weird bugs. The alternative would be to call isdir during parsing, but that seems like an expensive upfront cost. The nice counterpoint to this is that things like readdir can distinguish files and directories... but we could always change that in base anyways.
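The gap between the syntactic check and the filesystem check in the last point can be seen with Julia's own Base functions isdirpath and isdir on plain strings (a small sketch, independent of FilePathsBase):

```julia
# An existing directory; mktempdir returns its path without a trailing separator.
existing = mktempdir()

# A path that is written like a directory but does not exist on disk.
missing_dir = joinpath(existing, "nope") * "/"

# isdirpath only inspects the string (e.g. a trailing separator).
looks_like_dir = (isdirpath(existing), isdirpath(missing_dir))

# isdir asks the filesystem, which costs a system call.
really_is_dir = (isdir(existing), isdir(missing_dir))

println("isdirpath: ", looks_like_dir)
println("isdir:     ", really_is_dir)
```

The two checks disagree on both paths, which is the kind of mismatch a purely syntactic directory parameter would inherit.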
Another library worth looking at, which also makes some compromises: http://www.lihaoyi.com/post/HowtoworkwithFilesinScala.html#os-lib-a-simple-filesystem-library