rapidjson Add a pointer argument to GenericSchemaDocument constructor.

A Document can embed JSON schemas but not be a schema itself as a whole (e.g. a Swagger file, see sample below).

This PR allows to create a SchemaDocument from a Pointer into the source Document (some schema part), while local $refs still resolve from the root of the source Document.

For instance, with the below swagger sample (approximative):

{
  "swagger": "2.0",
  "paths": {
    "/some/path": {
      "post": {
        "parameters": [
          {
            "in": "body",
            "name": "body",
            "schema": {
              "properties": {
                "a": {
                  "$ref": "#/definitions/SomeBodyPropertySchema"
                },
                "b": {
                  "type": "string"
                }
              },
              "type": "object"
            }
          }
        ],
        "responses": {
          "200": {
            "schema": {
              "$ref": "#/definitions/SomeResponseSchema"
            }
          }
        }
      }
    }
  },
  "definitions": {
    "SomeBodyPropertySchema": {
      "properties": {
        "e": {
          "enum": [
            "X",
            "Y",
            "Z"
          ],
          "type": "string"
        }
      },
      "type": "object"
    },
    "SomeResponseSchema": {
      "properties": {
        "c": {
          "type": "string"
        },
        "d": {
          "type": "string"
        },
      },
      "type": "object"
    }
  }
}

One can:

Document swagger;
swagger.Parse(above_json);

SchemaDocument schema1{swagger, NULL, 0, NULL, NULL, Pointer("#~1some~1path/post/parameters/0/schema")};
SchemaDocument schema2{swagger, NULL, 0, NULL, NULL, Pointer("#~1some~1path/post/responses/200/schema")};

For the change to be minimal, this first proposed patch adds an optional Pointer argument (lastly) to the existing GenericSchemaDocument constructor. It may be easier for the caller if a new GenericSchemaDocument constructor with a non optional Pointer as second argument were provided, to avoid the NULLs and zeros above...

Nov 02 '18 12:11 ylavic

All committers have signed the CLA.

Nov 02 '18 12:11 tencent-adm

Coverage increased (+0.1%) to 99.882% when pulling d7458a9dc0d6d61a8a81750b47d1e09ae6757869 on ylavic:master into 1c2c8e085a8b2561dff17bedb689d2eb0609b689 on Tencent:master.

Nov 02 '18 12:11 coveralls

The patch is incomplete, when a Pointer is given, the schema creation shouldn't be recursive, and local references should be resolved immediately (or still deferred but always taken care of at final resolutions time).

I'm fixing this and will propose a new patch soon (with a test case).

Nov 02 '18 19:11 ylavic

Fixed (hopefully) in latest proposal.

Nov 02 '18 23:11 ylavic

I will review the code in a while. Thanks.

Nov 05 '18 03:11 miloyip

@miloyip This is a very useful enhancement. I have the same requirement and am going to see if it works for my scenario. Please consider merging.

Jan 27 '21 13:01 smhdfdl

@ylavic your PR makes this change to GenericSchemaDocument constructor ...

        // Generate root schema, it will call CreateSchema() to create sub-schemas,
        // And call HandleRefSchema() if there are $ref.
        // PR #1393 use input pointer if supplied
        root_ = typeless_;
        if (pointer.GetTokenCount() == 0) {
            CreateSchemaRecursive(&root_, pointer, document, document);
        }
        else if (const ValueType* v = pointer.Get(document)) {
            CreateSchema(&root_, pointer, *v, document);
        }

Please may I ask why the second call is to CreateSchema, I would expect it to be to CreateSchemaRecursive.

Mar 03 '21 10:03 smhdfdl

Please may I ask why the second call is to CreateSchema, I would expect it to be to CreateSchemaRecursive.

Not sure where the need for CreateSchemaRecursive comes from (i.e. compile all possible object/array entries from a Document while the root schema, its subschemas and the $refs only will possibly be used), since all the subschemas will be compiled when compiling the parent Schemas, and the $refs resolved and compiled while they are found. I suppose that some Document roots are not JSON schemas for some use cases, so CreateSchemaRecursive somehow bruteforcely tries to find them (though only the first found one will be the root IIUC).

With this PR, when a Pointer is given, it's supposed to point a root schema within the Document (at least this is how I use it for openapi/swagger documents), so CreateSchema can be used directly and will find all the needed/handled subschemas and $refs.

As a side note, I think that (with this PR still) using &Pointer("#") is functionally equivalent to NULL (for most cases where the Document root is a root schema), but much more effective in terms of compilation time/resources. EDIT: about the above side note, it seems that this patch actually uses a const Pointer& instead of const Pointer* as argument in the SchemaDocument constructor, so Pointer("#") is indistinguishable from the empty/recursive case. This can easily be changed though.

Mar 03 '21 12:03 ylavic

Rebased on current master, squashed a bit, and unittest updated with the new errorCode entry.

The optional Pointer is now a pointer instead of a reference too, to distinguish empty Pointer (root) from recursive walk (still the default).

Mar 03 '21 16:03 ylavic

@ylavic Thanks for reply. I think you are right about it being a brute force approach for documents that are not pure JSON schemas. However, when you take id into account, your statement "the root schema, its subschemas and the $refs only will possibly be used" is no longer strictly true, because an id anywhere in the hierarchy changes the base for JSON pointer or plain name fragments. See the example in https://github.com/Tencent/rapidjson/pull/1848#issuecomment-789575216; note the definition of Y which uses a $ref to point at X, the JSON pointer is relative to the current base URI given by the resolved id on B. So, we have to process the hierarchy in order to derive the correct base URI.

Because this area of code is affected by id/$ref, my PR #1848 includes the changes in this PR (not sure if you spotted that).

Mar 04 '21 09:03 smhdfdl