FilePathsBase.jl

Idea: RelativePath as own type, independent from other details

Open oxinabox opened this issue 4 years ago • 8 comments

Conceptually, under this proposal we separate the details of a protocol-specific path, which belong to an absolute path (e.g. PosixPath, S3Path, FTPPath etc.), from the simpler relative path, which can be more protocol agnostic.

AbsolutePaths would hold the protocol, have roots, hold config etc. A RelativePath would be agnostic to the details of any given type, and would only have relevant constraints (if any) imposed on it when joinpath was done.

RelativePaths could be used across protocols. For example, one could copy a file with its folder structure by splitting out a relative path of a local file and then joining it to the remote location path.
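A minimal sketch of that cross-protocol copy, assuming toy types: `RelSketch`, `LocalAbsSketch`, `RemoteAbsSketch`, `relative_to` and `rebase` are hypothetical names made up for this illustration and are not FilePathsBase API.

```julia
# Hypothetical toy types for illustration only; not FilePathsBase API.
struct RelSketch
    segments::Vector{String}
end

struct LocalAbsSketch
    segments::Vector{String}
end

struct RemoteAbsSketch
    bucket::String
    segments::Vector{String}
end

# Split a relative path out of `path` by stripping the leading segments of `base`.
function relative_to(path::LocalAbsSketch, base::LocalAbsSketch)
    n = length(base.segments)
    length(path.segments) >= n && path.segments[1:n] == base.segments ||
        throw(ArgumentError("path is not inside base"))
    return RelSketch(path.segments[n+1:end])
end

# Joining is the one place where protocol-specific constraints would be checked.
rebase(root::RemoteAbsSketch, rel::RelSketch) =
    RemoteAbsSketch(root.bucket, vcat(root.segments, rel.segments))

src  = LocalAbsSketch(["home", "user", "data", "2020", "a.csv"])
base = LocalAbsSketch(["home", "user"])
dst  = rebase(RemoteAbsSketch("bucket", ["backup"]), relative_to(src, base))
```

The relative part carries no protocol information, so the same `RelSketch` value could be joined onto any kind of absolute root.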

Logically one probably should only actually be able to open an AbsolutePath. One option is for some or all operations to take a RelativePath and do join(cwd(), relpath). Doing this always would probably mean RelativePath <: SystemPath.

One advantage is that we could automatically know that S3Path("bucket") means "s3://bucket/", since there is no such thing as a relative S3Path.

Further, we would get to forbid at dispatch time calls to joinpath(::Any, ::AbsolutePath), and wouldn't have to worry about how to handle nonsense like joinpath(p"s3://foo/bar/", p"s3://foo/zap/"). In the Haskell path library, the authors separate absolute and relative paths, and it helped a lot in terms of type safety. Julia is not a language where that kind of type safety really works out, since it is not ahead-of-time compiled. But I do think it might provide a lot of clarity to the code.
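A minimal sketch of what forbidding this at dispatch time could look like. All names here (`ToyPath`, `ToyRelative`, `ToyAbsolute`, `toyjoin`) are hypothetical and stand in for whatever the real types would be; the rejection still happens when the call runs, as a MethodError, rather than at compile time.

```julia
# Hypothetical names throughout; this is not the FilePathsBase type hierarchy.
abstract type ToyPath end

struct ToyRelative <: ToyPath
    segments::Vector{String}
end

struct ToyAbsolute <: ToyPath
    root::String
    segments::Vector{String}
end

# Only a relative path may appear on the right-hand side of a join.
toyjoin(a::ToyAbsolute, b::ToyRelative) = ToyAbsolute(a.root, vcat(a.segments, b.segments))
toyjoin(a::ToyRelative, b::ToyRelative) = ToyRelative(vcat(a.segments, b.segments))

bar = ToyAbsolute("s3://foo/", ["bar"])
zap = ToyAbsolute("s3://foo/", ["zap"])

ok = toyjoin(bar, ToyRelative(["x", "y"]))

# No method accepts an absolute path as the second argument,
# so joining two absolute paths fails with a MethodError.
rejected = try
    toyjoin(bar, zap)
    false
catch err
    err isa MethodError
end
```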

I think not having this distinction is what made https://github.com/rofinn/FilePathsBase.jl/pull/63 and https://github.com/JuliaCloud/AWSS3.jl/pull/74/ feel odd.

This is related to https://github.com/rofinn/FilePathsBase.jl/issues/71 but is a bit less of a wild idea. It is probably mutually exclusive with that idea.

oxinabox avatar Apr 06 '20 13:04 oxinabox

don't have to worry about how to handle nonsense like joinpath(p"s3://foo/bar/", p"s3://foo/zap/")

nonsense: https://github.com/JuliaCloud/AWSS3.jl/issues/75

nickrobinson251 avatar Apr 06 '20 13:04 nickrobinson251

In the Haskell path library, the authors separate absolute and relative paths, and it helped a lot in terms of type safety.

https://hackage.haskell.org/package/path

nickrobinson251 avatar Apr 06 '20 13:04 nickrobinson251

I'll note that a downside of this approach is that it pushes a lot more logic into the path construction (e.g., aggressive path normalization, system calls to check if something is a path). If we were going to do this I'm guessing we'd want to encode it sort of like Distributions.jl?

# Named `Kind` rather than `Type` so the new abstract type does not shadow Base's `Type`
abstract type Kind end
struct Directory <: Kind end
struct File <: Kind end

abstract type Form end
struct Absolute <: Form end
struct Relative <: Form end

abstract type AbstractPath{T<:Kind, F<:Form} end
abstract type SystemPath{T<:Kind, F<:Form} <: AbstractPath{T, F} end
struct PosixPath{T<:Kind, F<:Form} <: SystemPath{T, F}
    # ...
end

# Aliases that fix only the kind and leave the form free
const DirectoryPath = AbstractPath{Directory}
const FilePath = AbstractPath{File}

# If we say that relative paths can only make sense for system paths
const RelativePath{T<:Kind} = SystemPath{T, Relative}
const RelativeDirectory = SystemPath{Directory, Relative}
const RelativeFile = SystemPath{File, Relative}

# Absolute paths make sense for any path type
const AbsolutePath{T<:Kind} = AbstractPath{T, Absolute}
const AbsoluteDirectory = AbstractPath{Directory, Absolute}
const AbsoluteFile = AbstractPath{File, Absolute}
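A minimal sketch of how methods might dispatch on aliases of this kind. It is self-contained, so it repeats a trimmed version of the hierarchy, with the first parameter's supertype called `Kind` to avoid clashing with Base's `Type`. `DemoPath`, its `segments` field, `describe` and `is_abs` are hypothetical and exist only for this example.

```julia
abstract type Kind end
struct Directory <: Kind end
struct File <: Kind end

abstract type Form end
struct Absolute <: Form end
struct Relative <: Form end

abstract type AbstractPath{T<:Kind, F<:Form} end

# Hypothetical concrete type; the `segments` field is an assumption for this example.
struct DemoPath{T<:Kind, F<:Form} <: AbstractPath{T, F}
    segments::Vector{String}
end

const DirectoryPath = AbstractPath{Directory}
const AbsolutePath{T<:Kind} = AbstractPath{T, Absolute}

# Dispatch on the kind parameter only.
describe(::DirectoryPath) = "directory"
describe(::AbstractPath{File}) = "file"

# Dispatch on the form parameter only.
is_abs(::AbsolutePath) = true
is_abs(::AbstractPath) = false

d = DemoPath{Directory, Absolute}(["usr"])
f = DemoPath{File, Relative}(["a.txt"])
```

Here `describe(d)` selects the directory method and `is_abs(f)` falls through to the generic one, so both properties are resolved by dispatch with no runtime checks on the path contents.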

rofinn avatar Apr 06 '20 15:04 rofinn

Hmm

In Python's pathlib, a path is absolute if it has a root, and (if applicable) a drive.

So I suggest that perhaps an absolute path is a composite type containing both a root and a relative path.
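A minimal sketch of that composite layout, assuming toy types. `RelPart`, `RootedPath`, `extend`, `relative_part` and `render` are hypothetical names, and rendering with "/" is a simplification.

```julia
# Hypothetical types illustrating "absolute = drive + root + relative".
struct RelPart
    segments::Vector{String}
end

struct RootedPath
    drive::String   # empty on systems without drives
    root::String    # e.g. "/"
    rel::RelPart
end

# Appending only touches the relative part; drive and root are carried along.
extend(p::RootedPath, r::RelPart) =
    RootedPath(p.drive, p.root, RelPart(vcat(p.rel.segments, r.segments)))

# The relative part can be pulled back out without any parsing.
relative_part(p::RootedPath) = p.rel

# Simplified rendering that always uses "/" as the separator.
render(p::RootedPath) = p.drive * p.root * join(p.rel.segments, "/")

usr = RootedPath("", "/", RelPart(["usr"]))
bin = extend(usr, RelPart(["local", "bin"]))
```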

iamed2 avatar Apr 06 '20 22:04 iamed2

That's kind of already the case in that a relative path is the same type, but just has an empty root. You're suggesting having fully independent structs?

rofinn avatar Apr 06 '20 22:04 rofinn

One thing I'm not sure about is generic parsing of relative paths. Currently, we parse Windows and POSIX paths differently even when they're relative (e.g., the separator is different, a drive can be present). If we had one RelativePath that basically just wrapped the segments tuple then that wouldn't be possible. I feel like parameterizing the existing type hierarchy would make it easier for new base path types to decide how relative paths should be parsed and displayed. We could still have a general fallback though.
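A minimal sketch of why one generic relative parser is awkward. The two functions below are hypothetical and deliberately ignore drives; they only show that the same string yields different segments depending on which separators count.

```julia
# Hypothetical parsers for illustration; real Windows parsing also has to handle drives.
# On POSIX only '/' separates segments, and a backslash is an ordinary filename character.
parse_posix_rel(s::AbstractString) = String.(split(s, '/'; keepempty=false))

# On Windows both '\\' and '/' act as separators.
parse_windows_rel(s::AbstractString) = String.(split(s, ['/', '\\']; keepempty=false))

raw = "a\\b/c"
posix_segments = parse_posix_rel(raw)
windows_segments = parse_windows_rel(raw)
```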

rofinn avatar Apr 10 '20 19:04 rofinn

Okay, after playing with a prototype for a few days I don't think the benefits outweigh the extra complexity. Here is my reasoning:

  1. As mentioned above, a single RelativePath type wouldn't even cover the 2 system path types we currently have nicely, and would require a WindowsRelativePath and a PosixRelativePath, which also couldn't be both a SystemPath and an AbstractRelativePath. This would become even more difficult if we want to differentiate files and directories as demonstrated in the Haskell library linked.

  2. I think the parameterized types solution provides more of the flexibility we'd want in terms of describing the intersection of path types, directories/files and relative/absolute paths. The problem is that the solution is rather hacky based on how Julia works. I've provided some examples below:

  • Extracting and reconstructing parameterized types is awkward (fptype(src){form(src), kind(dst)} to construct a new path using the properties of the src and dst paths). The nice part is that it's very explicit about what it returns I guess?
  • Doesn't seem to provide many advantages over calling isrelative/isabsolute or isdir/isfile.
  • Parsing system paths needs to be very strict on directories including a trailing separator. This means that the Dir parameter only says that isdirpath is true, while isdir could still be false... and produce some weird bugs. The alternative would be to call isdir during parsing, but then that seems like an expensive upfront cost. The nice counterpoint to this is that things like readdir can distinguish files and directories... but we could always change that in base anyways.
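The gap between isdirpath and isdir in the last bullet can be seen with Base alone. This is a small sketch using plain strings rather than path types; the directory names are arbitrary.

```julia
# `isdirpath` only inspects the string; `isdir` asks the filesystem.
syntactic, actual, bare_syntactic, bare_actual = mktempdir() do dir
    claimed = joinpath(dir, "not-created") * "/"  # trailing separator, nothing on disk
    existing = joinpath(dir, "present")            # no trailing separator, created below
    mkdir(existing)
    (isdirpath(claimed), isdir(claimed), isdirpath(existing), isdir(existing))
end
```

A type parameter derived from the trailing separator would follow the first and third results, which is why it can disagree with what is actually on disk in either direction.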

rofinn avatar Apr 13 '20 20:04 rofinn

Another library worth looking at, which also makes some compromises: http://www.lihaoyi.com/post/HowtoworkwithFilesinScala.html#os-lib-a-simple-filesystem-library

rofinn avatar Apr 15 '20 18:04 rofinn