icu4x icon indicating copy to clipboard operation
icu4x copied to clipboard

add `schemars` to `ZeroVec`

Open ashu26jha opened this issue 1 year ago • 5 comments

This PR intends to add schemars support to ZeroVec. Implemented JsonSchema for following:

  1. ZeroVec
  2. VarZeroVec
  3. ZeroSlice
  4. ZeroVecError

Added a basic test, just to check the generation. The JSON generated from the test:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "DataStruct",
  "type": "object",
  "required": [
    "chars",
    "nested_numbers",
    "nums",
    "strs"
  ],
  "properties": {
    "chars": {
      "$ref": "#/definitions/ZeroVec<Character>"
    },
    "nested_numbers": {
      "$ref": "#/definitions/VarZeroVec<ZeroSlice<uint32>>"
    },
    "nums": {
      "$ref": "#/definitions/ZeroVec<uint32>"
    },
    "strs": {
      "$ref": "#/definitions/VarZeroVec<String>"
    }
  },
  "definitions": {
    "VarZeroVec<String>": {
      "type": "array",
      "items": {
        "type": "string"
      }
    },
    "VarZeroVec<ZeroSlice<uint32>>": {
      "type": "array",
      "items": {
        "$ref": "#/definitions/ZeroSlice<uint32>"
      }
    },
    "ZeroSlice<uint32>": {
      "type": "object",
      "required": [
        "data"
      ],
      "properties": {
        "data": {
          "type": "array",
          "items": {
            "type": "integer",
            "format": "uint8",
            "minimum": 0.0
          }
        },
        "error": {
          "$ref": "#/definitions/ZeroVecError"
        }
      }
    },
    "ZeroVec<Character>": {
      "type": "array",
      "items": {
        "type": "string",
        "maxLength": 1,
        "minLength": 1
      }
    },
    "ZeroVec<uint32>": {
      "type": "array",
      "items": {
        "type": "integer",
        "format": "uint32",
        "minimum": 0.0
      }
    },
    "ZeroVecError": {
      "description": "ZeroVecError is an enum representing errors that can occur during the decoding of slices of ULE",
      "type": "string",
      "enum": [
        "InvalidLength",
        "ParseError",
        "VarZeroVecFormatError"
      ]
    }
  }
}

Marking this part of #2118, as it may require more work.

Implement JsonSchema:

ashu26jha avatar Apr 10 '24 16:04 ashu26jha

  "properties": {
    "chars": {
      "$ref": "#/definitions/ZeroVec<Character>"
    },
    "nested_numbers": {
      "$ref": "#/definitions/VarZeroVec<ZeroSlice<uint32>>"
    },
    "nums": {
      "$ref": "#/definitions/ZeroVec<uint32>"
    },
    "strs": {
      "$ref": "#/definitions/VarZeroVec<String>"
    }
  },

While technically correct, I think I would expect these to be inlined at the call site rather than being stuffed into the definitions section. I imagine the definitions section being useful when you have deeply nested structs or a struct that gets used multiple times, but these vecs should just be vecs.

Is this configurable in the JsonSchema generator, or is it a result of the JsonSchema trait impl?

sffc avatar Apr 12 '24 06:04 sffc

In other words, I think my expectation for the JSON Schema of your sample struct would be

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "DataStruct",
  "type": "object",
  "required": [
    "chars",
    "nested_numbers",
    "nums",
    "strs"
  ],
  "properties": {
    "chars": {
      "type": "array",
      "items": {
        "type": "string",
        "maxLength": 1,
        "minLength": 1
      }
    },
    "nested_numbers": {
      "type": "array",
      "items": {
        "type": "array",
        "items": {
          "type": "integer",
          "format": "uint32",
          "minimum": 0.0
        }
      }
    },
    "nums": {
      "type": "array",
      "items": {
        "type": "integer",
        "format": "uint32",
        "minimum": 0.0
      }
    },
    "strs": {
      "type": "array",
      "items": {
        "type": "string"
      }
    }
  }
}

sffc avatar Apr 12 '24 06:04 sffc

While technically correct, I think I would expect these to be inlined at the call site rather than being stuffed into the definitions section. I imagine the definitions section being useful when you have deeply nested structs or a struct that gets used multiple times, but these vecs should just be vecs. Is this configurable in the JsonSchema generator, or is it a result of the JsonSchema trait impl?

@sffc Found something in the docs and in thier codebase:

What does docs mentions: is_referenceable() returns false for simpler schemas, but for complex/recursive ones it returns true. But the schema I tested is probably the simpliest of our use cases, not sure why we are getting references. is_referenceable(), $ref

What does code mentions: Under SchemaSettings we do indeed have inline_subschemas by default this is false, we can set this to true. Few references still be generated for recursive type (We do want to keep this).

I could give setting inline_subschemas to true a shot and see how it pans out.

In other words, I think my expectation for the JSON Schema of your sample struct would be

Yes indeed, myself not a fan of very big schemas. Having schemas too big basically defies the purpose.

ashu26jha avatar Apr 12 '24 18:04 ashu26jha

@sffc @Manishearth

New JSON, I was able to generate with setting inline_subschemas to true :

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "DataStruct",
  "type": "object",
  "required": [
    "chars",
    "nested_numbers",
    "nums",
    "strs"
  ],
  "properties": {
    "chars": {
      "type": "array",
      "items": {
        "type": "string",
        "maxLength": 1,
        "minLength": 1
      }
    },
    "nested_numbers": {
      "type": "array",
      "items": {
        "type": "array",
        "items": {
          "type": "integer",
          "format": "uint32",
          "minimum": 0.0
        }
      }
    },
    "nums": {
      "$ref": "#/definitions/ZeroVec<uint32>"
    },
    "strs": {
      "type": "array",
      "items": {
        "type": "string"
      }
    }
  }
}

Which one should we go for?

ashu26jha avatar Apr 13 '24 12:04 ashu26jha

Update the JSON, if all looks good, I think we are in position to write for ZeroMap & ZeroMap2d as well

ashu26jha avatar Apr 16 '24 04:04 ashu26jha