Manatee.Json

Generate JSON-Schema

Open mikemol opened this issue 7 years ago • 20 comments

Does this support generating a JSON-Schema by inspecting an object and extracting its attributes as properties within a schema?

I.e. could I take an object, emit both a JSON representation of the object's value and a JSON-Schema representation of the object's structure, and use one to validate the other?
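To make the question concrete, here is a minimal sketch of that round trip. It is written in Python for brevity and does not use Manatee.Json; the `Person` model, `schema_for`, and `is_valid` are hypothetical names, and the validator checks only `type` and `required`, nothing else from the JSON Schema specification.

```python
import json
from dataclasses import asdict, dataclass, fields

# Assumed mapping between Python types and JSON Schema type names.
JSON_TYPE_NAMES = {str: "string", int: "integer", float: "number", bool: "boolean"}
PYTHON_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


@dataclass
class Person:  # hypothetical model, for illustration only
    name: str
    age: int


def schema_for(cls):
    """Derive a flat object schema from the annotated fields of a dataclass."""
    properties = {f.name: {"type": JSON_TYPE_NAMES[f.type]} for f in fields(cls)}
    return {"type": "object", "properties": properties, "required": list(properties)}


def is_valid(value, schema):
    """Check 'required' and per-property 'type' only; not a full validator."""
    if not isinstance(value, dict):
        return False
    if any(name not in value for name in schema["required"]):
        return False
    for name, sub in schema["properties"].items():
        if name not in value:
            continue
        item = value[name]
        # bool is a subclass of int in Python, so treat it separately.
        if isinstance(item, bool) != (sub["type"] == "boolean"):
            return False
        if not isinstance(item, PYTHON_TYPES[sub["type"]]):
            return False
    return True


person = Person(name="Ada", age=36)
document = json.loads(json.dumps(asdict(person)))  # JSON form of the value
schema = schema_for(Person)                        # JSON Schema form of the structure
```

With this setup, `is_valid(document, schema)` accepts the serialized object, while a payload such as `{"name": "Ada", "age": "36"}` is rejected because `age` is not an integer.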

mikemol avatar Dec 08 '17 20:12 mikemol

At one point I had tried this, but I was having trouble with some types, like dictionaries. I'm open to exploring this again. It may be a while, as I'm working over at the schema spec repo and on some serialization performance enhancements.

gregsdennis avatar Dec 09 '17 00:12 gregsdennis

@mikemol what features do you need for this? For example, do you expect it to place value restrictions (like min/max) on the properties?

I'm thinking that the client (you) could place attributes from the System.ComponentModel.DataAnnotations namespace on the model. I already use DisplayAttribute for enum serialization, so I don't have to add a reference. The available validations are:

  • MinLength
  • MaxLength
  • Range
  • Required
  • StringLength (duplicates/combines MinLength & MaxLength)
  • EmailAddress
  • RegularExpression
  • Url

There isn't a built-in way to specify a schema for items in a list or dictionary, though.
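One plausible mapping from these attributes to JSON Schema keywords is sketched below, in Python rather than C# and independent of Manatee.Json. The function names and the tuple-based model description are hypothetical; the keyword choices (`format: email`, `format: uri`, and `Required` landing in the parent's `required` array) are assumptions about how a generator might behave, not something this thread settles. Note also that `pattern` expects an ECMA 262 regular expression, which may differ from a .NET one.

```python
def to_keywords(annotation, *args, json_type="string"):
    """Translate one DataAnnotations-style constraint into schema keywords."""
    if annotation == "MinLength":
        return {"minItems" if json_type == "array" else "minLength": args[0]}
    if annotation == "MaxLength":
        return {"maxItems" if json_type == "array" else "maxLength": args[0]}
    if annotation == "StringLength":
        # StringLength takes a maximum and an optional minimum.
        minimum = args[1] if len(args) > 1 else 0
        return {"minLength": minimum, "maxLength": args[0]}
    if annotation == "Range":
        return {"minimum": args[0], "maximum": args[1]}
    if annotation == "RegularExpression":
        return {"pattern": args[0]}
    if annotation == "EmailAddress":
        return {"format": "email"}
    if annotation == "Url":
        return {"format": "uri"}
    raise ValueError(f"unsupported annotation: {annotation}")


def object_schema(model):
    """model maps property name -> (json_type, [(annotation, *args), ...])."""
    properties, required = {}, []
    for name, (json_type, annotations) in model.items():
        prop = {"type": json_type}
        for annotation, *args in annotations:
            if annotation == "Required":
                required.append(name)  # 'required' lives on the parent object
            else:
                prop.update(to_keywords(annotation, *args, json_type=json_type))
        properties[name] = prop
    schema = {"type": "object", "properties": properties}
    if required:
        schema["required"] = required
    return schema


# Hypothetical model description standing in for attributed C# properties.
example = object_schema({
    "email": ("string", [("Required",), ("EmailAddress",)]),
    "age": ("integer", [("Range", 0, 150)]),
    "tags": ("array", [("MaxLength", 5)]),
})
```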

Thoughts?


Also, my guess is that you then want to deserialize these into your objects, correct? If that's the case, you'll need to be sure the serializer is configured properly (which properties to serialize, etc.). For instance, a schema can validate a read-only property, but the serializer wouldn't be able to deserialize it (because it's read-only).

gregsdennis avatar Dec 14 '17 03:12 gregsdennis

These attributes don't cover all of the possibilities for schema. For example, there's no way to specify that a minimum is exclusive vs. inclusive.

May have to create a set of attributes for this.
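As a rough illustration of what such an attribute would need to carry, the Python sketch below emits bounds in the draft-06/07 style, where `exclusiveMinimum` and `exclusiveMaximum` are numeric keywords (in draft-04 they were boolean modifiers of `minimum` and `maximum`). The function name and its flags are hypothetical.

```python
def range_keywords(minimum, maximum, *, exclusive_minimum=False, exclusive_maximum=False):
    """Return bound keywords, choosing the exclusive form when requested."""
    return {
        ("exclusiveMinimum" if exclusive_minimum else "minimum"): minimum,
        ("exclusiveMaximum" if exclusive_maximum else "maximum"): maximum,
    }


inclusive = range_keywords(0, 10)
half_open = range_keywords(0, 10, exclusive_minimum=True)
```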

gregsdennis avatar Dec 14 '17 11:12 gregsdennis

Just thinking out loud:

An argument against this feature might be that the schema types were designed to be easily created in code.

The response is that building schemas by hand duplicates the model and its requirements, which then have to be maintained in multiple places. If I instead place attributes directly on my model and generate the schema from that, all of my validation and my model live in one place.

gregsdennis avatar Dec 14 '17 11:12 gregsdennis

We use that approach in our Web APIs via Swagger.Net. We generate a Swagger definition via convention+attributes and then use that to generate compatible clients.

Likewise, if we had a convention+attribute model to generate schemas, we could: (a) rapidly create schemas for existing models, and (b) extend our web services to publish their REST model schemas.

sixlettervariables avatar Dec 14 '17 18:12 sixlettervariables

I think I'm going to need to have the client provide a serializer for this. There are mechanics that the serializer is already set up to address:

  • property name transforms
  • enumeration serialization (both number and named, and considering the Display attribute)
  • property selection (read-only vs read/write vs write-only)

gregsdennis avatar Dec 15 '17 02:12 gregsdennis

See branch feature/issue-121-schema-generation

@sixlettervariables & @mikemol, please see SchemaGenerationTarget for an example of usage of things currently supported. The test that uses this is in the GenerationTest class.

I'd like your input on the public-facing side to see if this approach will support your needs.

gregsdennis avatar Dec 15 '17 07:12 gregsdennis

I'll give it a go. Might take a bit. Had a hardware failure last night...

mikemol avatar Dec 15 '17 11:12 mikemol

I have made some additional improvements. I'm working on extracting common schemas into the root definitions collection.

gregsdennis avatar Dec 23 '17 04:12 gregsdennis

@mikemol I have a question about your preference for the output schema. The way I see it, there are a couple of options:

Repetition

Schemas for properties are simply defined at that property. If there is a property that has exactly the same type and restrictions, the schema is simply repeated for that property.

{
  "properties" : {
    "int1": { "type" : "integer", "maximum": 20 },
    "int2": { "type" : "integer", "maximum": 20 }
  }
}

Consolidation

The final schema is scanned for duplicates. The duplicates are then collected to be stored in the root's definitions collection and any usages are replaced by $refs.

{
  "definitions" : {
    "integer_max_20" : { "type" : "integer", "maximum": 20 }
  },
  "properties" : {
    "int1": { "$ref" : "#/definitions/integer_max_20" },
    "int2": { "$ref" : "#/definitions/integer_max_20" },
    "int3": { "type" : "integer", "minimum" : 10 }
  }
}
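The consolidation pass could look roughly like the Python sketch below. It is only an illustration of the idea, not the branch's implementation: it looks at top-level `properties` only, uses canonical JSON text as the equality key, and invents a `shared_N` naming scheme for the hoisted definitions.

```python
import json
from collections import defaultdict


def consolidate(schema):
    """Hoist property schemas that occur more than once into 'definitions'."""
    groups = defaultdict(list)
    for name, sub in schema.get("properties", {}).items():
        # Canonical JSON text serves as a structural-equality key.
        groups[json.dumps(sub, sort_keys=True)].append(name)

    properties = dict(schema.get("properties", {}))
    definitions = dict(schema.get("definitions", {}))
    counter = 0
    for key, names in groups.items():
        if len(names) < 2:
            continue  # unique schemas stay inline
        # Hypothetical naming scheme; skip names that are already taken.
        while f"shared_{counter}" in definitions:
            counter += 1
        def_name = f"shared_{counter}"
        definitions[def_name] = json.loads(key)
        for name in names:
            properties[name] = {"$ref": f"#/definitions/{def_name}"}

    result = dict(schema)
    result["properties"] = properties
    if definitions:
        result["definitions"] = definitions
    return result


repeated = {
    "properties": {
        "int1": {"type": "integer", "maximum": 20},
        "int2": {"type": "integer", "maximum": 20},
        "int3": {"type": "integer", "minimum": 10},
    }
}
consolidated = consolidate(repeated)
```

Here `int1` and `int2` end up pointing at the same definition, while `int3` stays inline because its schema appears only once. A real pass would also need to walk nested schemas and pick more descriptive names.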

gregsdennis avatar Dec 26 '17 06:12 gregsdennis

I've been using the branch in my test setup, and had to make one change for it to work with our performance enhancements:

@@ -126,1 +126,1 @@ Manatee.Json/Schema/Generation/SchemaGenerator.cs
-								  .Select(v => new EnumSchemaValue(((JsonValue)serializerMethod.Invoke(serializer, new[]{v})).String))
+								  .Select(v => new EnumSchemaValue(((JsonValue)serializerMethod.Invoke(serializer, v)).String))

It appears to work well for my needs (fairly simple).

sixlettervariables avatar Dec 29 '17 16:12 sixlettervariables

I'd also enjoy a JsonSchemaFactory method to call GenerateFor using the default schema format (although not strictly required).

My usage of JsonSchema in our app's event system would be as follows:

  • Distributed Events have an ID and a JSON Payload (dictated by a schema)
  • Distributed Events are either built-in or ad-hoc
    • Built-in Events are defined with a well-known type
      • When a built-in event is registered, its JSON schema is generated from the type and stored
    • Ad-hoc Events are defined by an ID and JSON Schema combo
  • When events are received their JSON payload is checked against their schema
    • Event receivers can indicate if they want only good events, good events plus failures, or all events (with "bad" payloads)

I'm not savvy enough yet with JSON Schema, but eventually I'd need to add support for forwards/backwards compatibility in the automatically generated schemas.

sixlettervariables avatar Dec 29 '17 16:12 sixlettervariables

@sixlettervariables I'm building it with draft 7 for now. I'll update to support all drafts once I nail down the logic.

I'm not sure what you mean by "events" in this context. Can you elaborate?

gregsdennis avatar Dec 29 '17 21:12 gregsdennis

@gregsdennis just an example of how we're using JsonSchema in our application (i.e. part of a distributed events system).

sixlettervariables avatar Dec 30 '17 01:12 sixlettervariables

@mikemol any thoughts or preferences on the repetition/consolidation approaches I mentioned above?

gregsdennis avatar Jan 07 '18 20:01 gregsdennis

@mikemol @sixlettervariables any additional thoughts on this? It's a good feature. I don't want to see it stall out, but I think we need to nail down the desired behavior.

gregsdennis avatar Mar 23 '18 10:03 gregsdennis

Okay.... It's been a long time, but I think I need to get this done. I've also since rebuilt the schema implementation, and I failed to keep this branch up-to-date. That means that I have a lot of work to redo. 😢

gregsdennis avatar Sep 26 '18 03:09 gregsdennis

Does the current library support generating a JSON Schema from a C# object?

fgajtanovski avatar May 18 '19 07:05 fgajtanovski

Sadly no. I never got the chance to rewrite this. Life happens, you know. If there's enough interest, I might be able to find some time.

gregsdennis avatar May 19 '19 22:05 gregsdennis

I would have loved this feature for a current project, but had to resort to writing schemas manually and just manipulating them with this lib. One thing that I think could be really beneficial: if you load your C# models with data and then generate the schema, the generator could take that data into account and add, for example, oneOf/anyOf or similar for the loaded data, and this could be specified with an attribute.

B1nke avatar Apr 08 '20 17:04 B1nke