json-schema-validator

A URI with 'file' protocol is not handled as it should be

Open Scanframe opened this issue 2 years ago • 6 comments

Problem

  • Directory Structure & Command
    • Files
    • Command
      • Python
      • C++ json-schema-validator
    • Main Schema File
  • Error Log

When the $id uses the file protocol, as in this case (file:///mnt/server/userdata/source/json-schemas/schema/customer.schema.json), an error is reported as soon as other schema files are referenced for definitions.

For comparison, the validator from the Linux package python3-jsonschema only allows the file:// protocol for local files, which is the most logical in my opinion. (The problem there is that it does not handle relative file paths.)

Directory Structure & Command

Files

<project-dir>
├── json
│   └── test.customer.json
└── schema
    ├── address.schema.json
    ├── customer.schema.json
    └── defs.schema.json

Command

Both commands are executed when the current directory is the project root.

Python

jsonschema -i json/test.customer.json schema/customer.schema.json

C++ json-schema-validator

json-schema-validate schema/customer.schema.json < json/test.customer.json

Main Schema File

The file below references other files. Those files can be found at this location.

{
	"$id": "file:///mnt/server/userdata/source/json-schemas/schema/customer.schema.json",
	"$schema": "http://json-schema.org/draft-07/schema#",
	"type": "object",
	"additionalProperties": false,
	"properties": {
		"name": {
			"type": "object",
			"additionalProperties": false,
			"properties": {
				"first": {
					"$ref": "defs.schema.json#/definitions/firstName"
				},
				"middle": {
					"$ref": "defs.schema.json#/definitions/middleName"
				},
				"last": {
					"$ref": "defs.schema.json#/definitions/lastName"
				}
			},
			"required": [
				"first",
				"middle",
				"last"
			]
		},
		"shipping_address": {
			"$ref": "address.schema.json"
		},
		"billing_address": {
			"$ref": "address.schema.json"
		},
		"parcel_size": {
			"type": "object",
			"additionalProperties": false,
			"properties": {
				"height": {
					"$ref": "defs.schema.json#/definitions/parcelSizeHeight"
				},
				"width": {
					"$ref": "defs.schema.json#/definitions/parcelSizeWidth"
				},
				"depth": {
					"$ref": "defs.schema.json#/definitions/parcelSizeDepth"
				}
			}
		}
	},
	"required": [
		"name",
		"shipping_address",
		"billing_address",
		"parcel_size"
	]
}

Error Log

setting root schema failed
could not open file:///mnt/server/userdata/source/json-schemas/schema/address.schema.json tried with .//mnt/server/userdata/source/json-schemas/schema/address.schema.json
ERROR: '"/billing_address"' - '{"city":"'s-Gravenhage","postal_code":"2514GL","state":"Zuid-Holland","street_address":"Noordeinde 68"}': unresolved or freed schema-reference file:///mnt/server/userdata/source/json-schemas/schema/address.schema.json # 
ERROR: '"/name/first"' - '"Prins"': unresolved or freed schema-reference file:///mnt/server/userdata/source/json-schemas/schema/defs.schema.json # /definitions/firstName
ERROR: '"/name/last"' - '"Oranje"': unresolved or freed schema-reference file:///mnt/server/userdata/source/json-schemas/schema/defs.schema.json # /definitions/lastName
ERROR: '"/name/middle"' - '"van"': unresolved or freed schema-reference file:///mnt/server/userdata/source/json-schemas/schema/defs.schema.json # /definitions/middleName
ERROR: '"/parcel_size/depth"' - '30': unresolved or freed schema-reference file:///mnt/server/userdata/source/json-schemas/schema/defs.schema.json # /definitions/parcelSizeDepth
ERROR: '"/parcel_size/height"' - '200': unresolved or freed schema-reference file:///mnt/server/userdata/source/json-schemas/schema/defs.schema.json # /definitions/parcelSizeHeight
ERROR: '"/parcel_size/width"' - '80': unresolved or freed schema-reference file:///mnt/server/userdata/source/json-schemas/schema/defs.schema.json # /definitions/parcelSizeWidth
ERROR: '"/shipping_address"' - '{"city":"'s-Gravenhage","postal_code":"2513BJ","state":"Zuid-Hoilland","street_address":"Molenstraat 27"}': unresolved or freed schema-reference file:///mnt/server/userdata/source/json-schemas/schema/address.schema.json # 
schema validation failed

Scanframe avatar Feb 27 '23 13:02 Scanframe

The validator program you're using is just a test program, an example showing how to use the library.

The simple loader-callback is actually doing a good job, because it uses the URL-path of the root-schema to find the other sub-schemas.

If you use the library, please write your own loader callback matching your infrastructure.

To solve your problem, the validator library needs to be aware of the initial filename and path of the root-schema. As of today it isn't. It seems the Python one is doing that.
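A minimal sketch of what "being aware of the initial filename and path" could look like, assuming the application knows both the $id of the root schema and the file it was loaded from. The helpers `directory_of` and `local_filename` are hypothetical names used for illustration and are not part of json-schema-validator; the sketch only handles sub-schemas that live in the same directory tree as the root schema.

```cpp
#include <string>

// Hypothetical helper: everything up to and including the last '/'.
static std::string directory_of(const std::string &location)
{
	const auto slash = location.rfind('/');
	return slash == std::string::npos ? std::string() : location.substr(0, slash + 1);
}

// Hypothetical helper: translate the URI of a sub-schema into a local file
// name, given the "$id" of the root schema and the file it was loaded from.
// Returns an empty string when the URI is not located below the "$id" directory.
std::string local_filename(const std::string &root_id,
                           const std::string &root_file,
                           const std::string &uri)
{
	const std::string id_dir = directory_of(root_id);
	// Only URIs that share the directory of the root "$id" can be mapped.
	if (id_dir.empty() || uri.compare(0, id_dir.size(), id_dir) != 0)
		return std::string();
	// Swap the "$id" directory for the directory of the file on disk.
	return directory_of(root_file) + uri.substr(id_dir.size());
}
```

With the values from this issue, the $id of customer.schema.json, the file name schema/customer.schema.json, and the URI of address.schema.json would map to schema/address.schema.json, independent of what the $id claims about the location.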

If you don't want to integrate the library in your program and just want to use an executable, why not stick with the Python one?

Otherwise, do not hesitate to suggest a patch for the example so that it does what you want.

Btw. Isn't it very strange that the $id-tag contains a local file path?

pboettch avatar Feb 27 '23 22:02 pboettch

Thanks for responding.

Btw. Isn't it very strange that the $id-tag contains a local file path?

When your system/application has no access to webservers then this is the only option.

I assumed the $id-tag in the main schema can only contain a URI to identify its resource. The other linked or referenced schemas can use a relative location to the main one.

I tried to fix it in the code, but the path is prefixed with ./, which is fine for the http protocol but not for the file protocol. From there it became too complex to figure out in a short time what to change to make it work.

Scanframe avatar Mar 03 '23 11:03 Scanframe

Btw. Isn't it very strange that the $id-tag contains a local file path?

When your system/application has no access to webservers then this is the only option.

No, it seems common usage to put http-addresses as $ids, even though nothing is looking up anything on the internet.

I tried to fix it in the code, but the path is prefixed with ./, which is fine for the http protocol but not for the file protocol. From there it became too complex to figure out in a short time what to change to make it work.

The library also does not support ../-relative path references. This might be related. Someone with time needs to take a look.
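For whoever takes a look: supporting ../ mostly comes down to collapsing dot segments in the resolved path before it is handed to the loader. The sketch below is a simplified take on that idea, loosely following RFC 3986 section 5.2.4. The function name `remove_dot_segments` is an assumption for illustration, it is not an existing function of the library, and it deliberately ignores trailing slashes and empty segments.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not part of json-schema-validator: collapses "." and
// ".." segments of an absolute URI path. Simplified: empty segments and a
// trailing slash are dropped.
std::string remove_dot_segments(const std::string &path)
{
	std::vector<std::string> segments;
	std::istringstream input(path);
	std::string segment;
	while (std::getline(input, segment, '/')) {
		if (segment.empty() || segment == ".")
			continue; // skip empty and "current directory" segments
		if (segment == "..") {
			if (!segments.empty())
				segments.pop_back(); // step up one level, never above the root
			continue;
		}
		segments.push_back(segment);
	}

	// Reassemble the remaining segments into an absolute path.
	std::string result;
	for (const auto &s : segments)
		result += "/" + s;
	return result.empty() ? "/" : result;
}
```

As an illustrative input (not taken from this issue), /schema/../common/defs.schema.json would become /common/defs.schema.json.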

pboettch avatar Mar 03 '23 11:03 pboettch

No, it seems common usage to put http-addresses as $ids, even though nothing is looking up anything on the internet.

My understanding is that when the $id is omitted from the main schema file, the secondary referenced schema files are not found at all. The $id sets the location where the other schema files are to be found. When using only a single schema file, nothing in the $id tag matters since nothing is externally referenced.
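To make that concrete, here is a rough sketch of how a relative $ref ends up as the absolute references shown in the error log. The function `resolve_ref` is a hypothetical name for illustration, not the library's implementation, and it only covers the simple "file.json#/pointer" form used in this issue (no ../ and no absolute references).

```cpp
#include <string>
#include <utility>

// Hypothetical helper: resolves a relative "$ref" against the "$id" of the
// enclosing schema. Returns {document location, JSON pointer}.
std::pair<std::string, std::string> resolve_ref(const std::string &base_id,
                                                const std::string &ref)
{
	// Split the reference into its document part and its fragment.
	const auto hash = ref.find('#');
	const std::string document = ref.substr(0, hash);
	const std::string pointer = hash == std::string::npos ? "" : ref.substr(hash + 1);

	// An empty document part refers to the base document itself.
	if (document.empty())
		return {base_id, pointer};

	// Replace the last path segment of the base with the referenced document.
	const auto slash = base_id.rfind('/');
	const std::string directory = slash == std::string::npos ? "" : base_id.substr(0, slash + 1);
	return {directory + document, pointer};
}
```

Resolving defs.schema.json#/definitions/firstName against the $id of customer.schema.json gives file:///mnt/server/userdata/source/json-schemas/schema/defs.schema.json with the pointer /definitions/firstName, which matches the reference printed in the error log.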

Scanframe avatar Mar 03 '23 12:03 Scanframe

The other schema-validators I saw all use callbacks for the user to handle the loading of additional schemas. So, it's up to the application to handle the evaluation of the URL of $id.

The problem you have is that file:// is (probably) not handled correctly in the URL-class.

OK, but you are also using an example program which is not really designed to be generic. Maybe we can fix it there? In the loader callback, if the protocol is file, we remove the .?
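A possible shape for that change, as a sketch only. The error log above shows the example loader trying .//mnt/..., so it appears to prepend ./ to the path of the URI. The helper `schema_filename` below is a hypothetical name, and its `scheme` and `path` parameters stand for the corresponding parts of the URI passed to the loader callback; how those are obtained from the library's URI class is left out here.

```cpp
#include <string>

// Hypothetical helper for a loader callback: picks the file name to open
// for a schema URI, given the URI's scheme and path.
std::string schema_filename(const std::string &scheme, const std::string &path)
{
	// A file:// URI already carries an absolute local path: use it as-is.
	if (scheme == "file")
		return path;
	// Otherwise keep the behaviour seen in the error log and look for the
	// schema below the current directory.
	return "./" + path;
}
```

For the failing reference in this issue, scheme file and path /mnt/server/userdata/source/json-schemas/schema/address.schema.json would then be opened directly instead of as .//mnt/server/... relative to the working directory.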

pboettch avatar Mar 03 '23 12:03 pboettch

OK, but you are also using an example program which is not really designed to be generic. Maybe we can fix it there? In the loader callback, if the protocol is file, we remove the .?

I can contribute by trying to fix it.

BTW...

I used the FetchContent_xxxxx CMake functions instead of the Hunter ones. FetchContent itself needs at least CMake 3.11; FetchContent_MakeAvailable, used below, needs at least CMake 3.14.

file: cmake/nlohmann_jsonConfig.cmake

# FetchContent added in CMake 3.11, downloads during the configure step
include(FetchContent)
# Import Json library.
FetchContent_Declare(
	nlohmann-json
	GIT_REPOSITORY https://github.com/nlohmann/json
	GIT_TAG v3.8.0
	)
# Adds nlohmann_json::nlohmann_json
FetchContent_MakeAvailable(nlohmann-json)

Addition in main CMakeLists.txt

# Make it so our own packages are found and also the ones in the sub-module library.
list(APPEND CMAKE_PREFIX_PATH "${CMAKE_CURRENT_LIST_DIR}/cmake")

Scanframe avatar Mar 03 '23 13:03 Scanframe