
UriResolver doesn't handle Windows directory separators

Open Cryszon opened this issue 8 years ago • 9 comments

The regex in UriResolver->parse() doesn't correctly parse a file path that contains Windows directory separators. As a result, $basePath becomes an empty string, which breaks relative $refs.

// UriResolver->resolve($uri, $baseUri = null)
$baseComponents = $this->parse($baseUri); // $baseUri = "file://C:\code\my-schema.json"
$basePath = $baseComponents['path']; // $basePath is empty.

The \ directory separators are inserted when realpath() is called on a Windows platform. The regex doesn't necessarily need to be changed (although it is a bit messy and silently returns an empty path, which made the issue hard to debug). A simple workaround is to replace directory separators when resolving the initial schema file as illustrated below.

$schema = $refResolver->resolve('file://' . str_replace('\\', '/', realpath('schema.json')));

Cryszon, Apr 28 '16 10:04

We have some tests for this to help with regression:

https://github.com/justinrainbow/json-schema/blob/9ef71fdf8aa59a93977446468fc718ea01115a83/tests/JsonSchema/Tests/Uri/UriResolverTest.php#L61-L70

It seems like we need to incorporate DIRECTORY_SEPARATOR in places like this:

https://github.com/justinrainbow/json-schema/blob/9ef71fdf8aa59a93977446468fc718ea01115a83/src/JsonSchema/Uri/UriResolver.php#L106-L128

@jojo1981 thoughts?

bighappyface, Apr 28 '16 14:04

@bighappyface I agree that the UriResolver must be able to handle URIs that contain a Windows path.

jojo1981, Apr 29 '16 17:04

@bighappyface The big question here is whether we need to fix this. According to RFC 3986, the path part of a URI may contain only forward slashes, not backslashes. I also tested this behavior with parse_url(): http://php.net/manual/en/function.parse-url.php

$uri = 'file://C:\code\my-schema.json';
var_dump(parse_url($uri));

bool(false)


$uri = 'file:///C:/code/my-schema.json';
var_dump(parse_url($uri));

array(2) {
  'scheme' =>
  string(4) "file"
  'path' =>
  string(22) "C:/code/my-schema.json"
}

NOTE: the second example has an extra slash after the scheme: "file:///C:/code/my-schema.json"
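
For illustration, the conversion from a native absolute path (such as the result of realpath()) to the three-slash form could be sketched as follows. The function name pathToFileUri is hypothetical and not part of the library, and the thread does not confirm whether UriResolver accepts the three-slash form, so treat this as a sketch under those assumptions.

```php
<?php
// Hypothetical helper, not part of justinrainbow/json-schema.
// Expects an absolute path, e.g. the result of realpath().
function pathToFileUri($path)
{
    // Use forward slashes regardless of platform.
    $path = str_replace('\\', '/', $path);

    // A Windows drive path ("C:/...") gets a leading slash so that the
    // authority part of the URI stays empty: file:///C:/...
    if (preg_match('~^[A-Za-z]:/~', $path) === 1) {
        $path = '/' . $path;
    }

    return 'file://' . $path;
}

// Prints: file:///C:/code/my-schema.json
echo pathToFileUri('C:\code\my-schema.json'), PHP_EOL;
```

On a Unix-style path such as /srv/schema.json the helper changes nothing except adding the file:// prefix.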

jojo1981, Apr 29 '16 21:04

IMO the parse() itself shouldn't be responsible for handling this, since it only applies to file paths on a specific platform and would introduce a behind-the-scenes conversion of the input value. The resolver should parse whatever value is passed to it as-is based on the URI standard.

Instead I think there could be a mention about this in the documentation. In addition a notice or an error with a message showing the original URI could be triggered if the parsed scheme is file, but an empty path is returned.

Maybe the slashes could be automatically handled elsewhere, but not in the parse().
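
A rough sketch of the check suggested above, for a file URI that parses to an empty path. The function name assertFilePathParsed is hypothetical, and the 'scheme' key is an assumption about the parsed components (only the 'path' key appears in this thread). It throws an exception instead of triggering a notice, which is one possible choice rather than the library's behavior.

```php
<?php
// Hypothetical guard, not part of the library. Assumes the parsed
// components are an array with 'scheme' and 'path' keys.
function assertFilePathParsed(array $components, $originalUri)
{
    $scheme = isset($components['scheme']) ? $components['scheme'] : '';
    $path = isset($components['path']) ? $components['path'] : '';

    // An empty path for a file URI usually means the input was not a
    // valid URI, e.g. it contained Windows directory separators.
    if ($scheme === 'file' && $path === '') {
        throw new InvalidArgumentException(sprintf(
            "Parsed an empty path from file URI '%s'. On Windows, replace '\\' with '/' before resolving.",
            $originalUri
        ));
    }
}
```

Including the original URI in the message is the important part, because the empty path on its own gives no hint about what went wrong.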

Cryszon, May 02 '16 09:05

It seems this issue is still open, and as far as I know there's no clear workaround as the issue is spawned in the JsonSchema codebase itself, because it uses realpath (on Windows). This can't be bypassed, unless I'm missing something.

Dykam, Mar 11 '19 12:03

I ran into a similar problem. @Cryszon @jojo1981 If your Windows path in __DIR__ contains something like '\02', it leads to an incorrect path. The code is in json-schema\src\JsonSchema\Uri\UriRetriever.php (line 344, in function translate):

$uri = preg_replace('|^package://|', sprintf('file://%s/', realpath(__DIR__ . '/../../..')), $uri);
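
The likely cause is that preg_replace() interprets a backslash followed by digits in the replacement string as a backreference, so "\02" is read as a reference to capture group 2. The sketch below reproduces this with a made-up path standing in for realpath(__DIR__ . '/../../..'), and shows one possible way around it with preg_replace_callback(). It is not the library's actual fix.

```php
<?php
// Stand-in for realpath(__DIR__ . '/../../..') on Windows (made-up path).
$base = 'C:\02_work\json-schema';
$uri = 'package://schema.json';

// Same shape as the call quoted above: "\02" in the replacement is read as
// a reference to capture group 2, which does not exist, so it is dropped.
$broken = preg_replace('|^package://|', sprintf('file://%s/', $base), $uri);

// A callback's return value is inserted literally, and normalizing the
// separators first also keeps backslashes out of the resulting URI.
$fixed = preg_replace_callback(
    '|^package://|',
    function () use ($base) {
        return 'file://' . str_replace('\\', '/', $base) . '/';
    },
    $uri
);

echo $broken, PHP_EOL; // the "\02" part of the directory name is missing
echo $fixed, PHP_EOL;  // file://C:/02_work/json-schema/schema.json
```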

rapax87, Aug 19 '19 04:08

I think we also ran into this problem at @pronamic and were able to reproduce it via a PHPUnit test / GitHub Action that runs on windows-latest:

https://github.com/pronamic/wp-mollie/actions/runs/6097473885/job/16545183100

Perhaps this will help maintainers to address this issue, workflow file:

https://github.com/pronamic/wp-mollie/actions/runs/6097473885/workflow

JsonSchema\Exception\UriResolverException: Unable to resolve URI 'amount.json' from base ''

remcotolsma, Sep 07 '23 08:09

@Cryszon If this is still a valid issue, we are happy to look into whether we can support Windows file paths. Your input would be valued.

After checking RFC 3986, I can see that a URI cannot contain the Windows directory separator. However, this Wikipedia article (which I haven't fact-checked yet) mentions the Windows 2-slash and 4-slash formats, e.g.:

file path: \\server\folder\data.xml
2-slash: file://server/folder/data.xml
4-slash: file:////server/folder/data.xml

But these are still non-standard URIs.

I guess that in order to fully support Windows, we would need the active involvement of someone who uses Windows and has some in-depth knowledge of this library.

DannyvdSluijs, Feb 26 '24 20:02

The issue is not relevant for me personally as I've since moved on to WSL and Docker for my development environment.

In a perfect world every system would implement RFCs and standards accurately but the reality is that we must be prepared to make compromises to ensure a smooth user experience. Regardless of whether this issue is fixed or not, I think users should at least be made aware of it by mentioning it somewhere like an error message or in the documentation. Especially if the issue is easily fixable by user code.

While I do agree that comprehensive Windows support requires more effort beyond fixing a single directory separator issue, my capacity to assist on that scale is very limited.

Cryszon, Mar 01 '24 13:03