
`path_segments/2` mangles UNC path root on Windows.

Open rotu opened this issue 8 months ago • 1 comments

On Windows, path_segments/2 splits on all slashes, even if those slashes are not directory delimiters:

?- use_module(library(files)).
   true.
?- path_canonical("C:/Windows",A),path_segments(A,B).
   A = "\\\\?\\C:\\Windows", B = [[],[],"?","C:","Windows"].
?- path_segments("\\\\127.0.0.1\\c$\\temp\\test-file.txt",B).
   B = [[],[],"127.0.0.1","c$","temp","test-file.txt"].
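For illustration only, here is a minimal sketch of a split that keeps a UNC-style root in one piece. The predicate `unc_root_segments/3` and its helpers are hypothetical names, not part of `library(files)`, and the rule "the root is the first two components after the leading double backslash" is a simplifying assumption: it ignores forms such as `\\?\UNC\server\share`, drive-letter paths without a prefix, and forward slashes.

```prolog
:- use_module(library(dcgs)).
:- use_module(library(lists)).

% Hypothetical helper, NOT part of library(files).
% unc_root_segments(+Path, -Root, -Segments): Path must start with two
% backslashes. Root keeps the leading "\\first\second" prefix intact and
% only the remainder is split on backslashes.
unc_root_segments(Path, Root, Segments) :-
    phrase(unc(Root, Segments), Path).

unc(Root, Segments) -->
    "\\\\", component(First), "\\", component(Second),
    { First = [_|_], Second = [_|_],          % both parts must be non-empty
      append(["\\\\", First, "\\", Second], Root) },
    rest(Segments).

% After the root: either nothing, or a backslash followed by segments.
rest([]) --> [].
rest(Segments) --> "\\", segments(Segments).

segments([S|Ss]) --> component(S), more_segments(Ss).

more_segments([]) --> [].
more_segments(Ss) --> "\\", segments(Ss).

% A run of characters containing no backslash (possibly empty).
component([C|Cs]) --> [C], { C \== '\\' }, component(Cs).
component([]) --> [].
```

Under these assumptions, the second query above would yield the root `\\127.0.0.1\c$` with segments `["temp","test-file.txt"]`, and the canonicalized `\\?\C:\Windows` would yield the root `\\?\C:` with the single segment `"Windows"`.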

rotu avatar May 02 '25 22:05 rotu

I recently wrote a small library to deal with file paths. It addresses this and a deeper problem: you can't really trust the contents of a path, because its meaning is defined purely by the underlying file system and, in the general case, can be anything. I'll post it once it is "publishable" :)

Example:

load_and_parse_file(UserPath, ParsedSomething) :-
    path(["/home/user",/,UserPath], S),
    parsed_something(S, ParsedSomething).

The predicate path/2 is a relation between a list of path segments and special atoms (/, root, . and ..) on one side, and a system path, which is always an opaque string, on the other. It also gives me flexibility in handling possibly unsafe (and even malicious) user input by applying certain "policies", for example:

load_and_parse_file(UserPath, ParsedSomething) :-
    path(np, ["/home/user",/,UserPath], S),
    parsed_something(S, ParsedSomething).

Note the atom np, which stands for "no-parent". With it, the predicate guarantees not only that S is a correct system path, but also that the last segment doesn't "escape" from its parent, for example through symbolic links. This can't be done without support from the OS, but it must be done for a nice and secure API.

I didn't invent this; I modeled my approach on Go's os.Root: https://go.dev/blog/osroot.

My implementation is still vulnerable to so-called TOCTOU (time-of-check to time-of-use) attacks. Closing that gap in Scryer is very hard for me because I'm not proficient in Rust.
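To make the kind of check and its weakness concrete, here is a rough sketch of a containment test built only on `library(files)`. It is not the library described above: `path_inside/2` and `strict_prefix/2` are hypothetical names, and the sketch assumes that `path_canonical/2` resolves symbolic links and that both paths already exist.

```prolog
:- use_module(library(files)).
:- use_module(library(lists)).

% Hypothetical helper, not the path/3 predicate described above and
% not part of library(files).
% path_inside(+Base, +Path): after canonicalization, Path is a strict
% descendant of Base. Assumes path_canonical/2 resolves symlinks.
path_inside(Base, Path) :-
    path_canonical(Base, CanonBase),
    path_canonical(Path, CanonPath),
    path_segments(CanonBase, BaseSegments),
    path_segments(CanonPath, PathSegments),
    strict_prefix(BaseSegments, PathSegments).

% strict_prefix(Prefix, List): List is Prefix followed by at least one
% more element. Comparing whole segments avoids treating "/home/user2"
% as being inside "/home/user".
strict_prefix(Prefix, List) :-
    append(Prefix, [_|_], List).
```

The check and the later file access are separate steps, so a symbolic link swapped in between them defeats it; that window is exactly the TOCTOU problem. The sketch also does not handle a Base that is the file system root.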

P.S. I have never tested my library on Windows, but your example will be handled gracefully, because "\\\\" is not treated as "root"; it is just a path segment that the library uses as-is.

P.P.S. My notion of a segment differs from that of the path_segments/2 predicate: for me a segment is any string and can contain path separators. Also, my predicate has a preferred direction; I think it covers most realistic use cases.

hurufu avatar Jun 02 '25 19:06 hurufu