spark-json-schema

Script failing while generating schema from nested Json Schema format

Open 3mlabs opened this issue 4 years ago • 1 comments

Hey guys,

I am trying to generate a Spark schema from the JSON Schema below:

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "maintenanceWorkOrder": {
      "type": "object",
      "properties": {
        "workOrderNumber": {
          "type": "string"
        },
        "workOrderActivity": {
          "type": "array",
          "items": [
            {
              "type": "object",
              "properties": {
                "workOrderNumber": {
                  "type": "string"
                },
                "organization": {
                  "type": "string"
                }
              },
              "required": [
                "workOrderNumber",
                "organization"
              ]
            }
          ]
        }
      },
      "required": [
        "workOrderNumber",
        "workOrderActivity"
      ]
    }
  },
  "required": [
    "maintenanceWorkOrder"
  ]
}

It looks like a very simple JSON schema, but it gives me the error below:

scala> val schema = SchemaConverter.convertContent(json_wo_lines)
play.api.libs.json.JsResultException: JsResultException(errors:List((,List(ValidationError(List(error.expected.jsobject),WrappedArray())))))
  at play.api.libs.json.JsReadable$$anonfun$2.apply(JsReadable.scala:23)
  at play.api.libs.json.JsReadable$$anonfun$2.apply(JsReadable.scala:23)
  at play.api.libs.json.JsResult$class.fold(JsResult.scala:73)
  at play.api.libs.json.JsError.fold(JsResult.scala:13)
  at play.api.libs.json.JsReadable$class.as(JsReadable.scala:21)
  at play.api.libs.json.JsDefined.as(JsLookup.scala:132)
  at org.zalando.spark.jsonschema.SchemaConverter$.getFieldType(SchemaConverter.scala:184)
  at org.zalando.spark.jsonschema.SchemaConverter$.addJsonField(SchemaConverter.scala:171)
  at org.zalando.spark.jsonschema.SchemaConverter$.convertJsonStruct(SchemaConverter.scala:134)
  at org.zalando.spark.jsonschema.SchemaConverter$.getDataType(SchemaConverter.scala:203)
  at org.zalando.spark.jsonschema.SchemaConverter$.getFieldType(SchemaConverter.scala:190)
  at org.zalando.spark.jsonschema.SchemaConverter$.addJsonField(SchemaConverter.scala:171)
  at org.zalando.spark.jsonschema.SchemaConverter$.convertJsonStruct(SchemaConverter.scala:134)
  at org.zalando.spark.jsonschema.SchemaConverter$.convert(SchemaConverter.scala:74)
  at org.zalando.spark.jsonschema.SchemaConverter$.convertContent(SchemaConverter.scala:60)
  ... 49 elided

Let me know if anyone has faced a similar issue and has a resolution.

Thanks,

3mlabs, Apr 29 '20 18:04

Hi @3mlabs and thanks for reporting this.

The issue is with how the workOrderActivity array field is defined.

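Arrays in JSON Schema model two distinct concepts: lists and tuples. In list validation, "items" is a single schema that every element must match; in tuple validation, "items" is an array of schemas, one per position. Our library supports list-validated arrays, but we do not currently support tuple-validated arrays.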
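To spot tuple-validated arrays in a larger schema before handing it to the converter, a small check along these lines may help. This is a minimal Python sketch and not part of the library: `find_tuple_items` is a hypothetical helper name, and it assumes that any "items" keyword whose value is a JSON array means tuple validation (it does not try to distinguish schema keywords from literal data such as "enum" values).

```python
import json


def find_tuple_items(schema, path="$"):
    """Return the paths of every "items" keyword whose value is an array.

    Heuristic sketch: walks every nested dict and list in the parsed schema.
    """
    found = []
    if isinstance(schema, dict):
        # An array under "items" is tuple validation; a dict is list validation.
        if isinstance(schema.get("items"), list):
            found.append(path + ".items")
        for key, value in schema.items():
            found.extend(find_tuple_items(value, f"{path}.{key}"))
    elif isinstance(schema, list):
        for index, value in enumerate(schema):
            found.extend(find_tuple_items(value, f"{path}[{index}]"))
    return found


# Reduced version of the array field from the report above.
sample_schema = json.loads("""
{
  "type": "object",
  "properties": {
    "workOrderActivity": {
      "type": "array",
      "items": [ { "type": "object" } ]
    }
  }
}
""")

tuple_paths = find_tuple_items(sample_schema)
print(tuple_paths)
```

An empty result suggests the schema only uses list-validated arrays; each reported path is a field whose "items" would need to become a single object.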

If your intention is to make workOrderActivity an array of objects that all share the same shape, like the example below, you can simply change the definition of workOrderActivity.items from an array of objects to a single object.

If data like this meets your requirement:

{
  "maintenanceWorkOrder": {
    "workOrderNumber": "123",
    "workOrderActivity": [
      {
        "workOrderNumber": "order_001",
        "organization": "org_A"
      },
      {
        "workOrderNumber": "order_002",
        "organization": "org_B"
      }
    ]
  }
}

This definition would be more correct (and would work with our library):

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "maintenanceWorkOrder": {
      "type": "object",
      "properties": {
        "workOrderNumber": {
          "type": "string"
        },
        "workOrderActivity": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "workOrderNumber": {
                "type": "string"
              },
              "organization": {
                "type": "string"
              }
            },
            "required": [
              "workOrderNumber",
              "organization"
            ]
          }
        }
      },
      "required": [
        "workOrderNumber",
        "workOrderActivity"
      ]
    }
  },
  "required": [
    "maintenanceWorkOrder"
  ]
}
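If a schema has many arrays defined this way, the same change can be applied programmatically. The following is a minimal Python sketch under stated assumptions, not something the library provides: `items_tuple_to_list` is a hypothetical helper name, and it only rewrites "items" arrays that contain exactly one object. Note that this deliberately changes the meaning of the schema from "validate the first element" to "validate every element", so it is only appropriate when that is what you intend. Tuples with several entries are left untouched.

```python
import copy
import json


def items_tuple_to_list(schema):
    """Return a copy of `schema` with single-element tuple "items" unwrapped."""

    def walk(node):
        if isinstance(node, dict):
            items = node.get("items")
            # Unwrap [ {...} ] into {...}; leave longer tuples as they are.
            if isinstance(items, list) and len(items) == 1 and isinstance(items[0], dict):
                node["items"] = items[0]
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for value in node:
                walk(value)

    result = copy.deepcopy(schema)  # keep the caller's schema unchanged
    walk(result)
    return result


# Reduced version of the array field from the original report.
original = json.loads("""
{
  "type": "object",
  "properties": {
    "workOrderActivity": {
      "type": "array",
      "items": [
        {
          "type": "object",
          "properties": { "organization": { "type": "string" } }
        }
      ]
    }
  }
}
""")

converted = items_tuple_to_list(original)
print(json.dumps(converted, indent=2))
```

The printed schema can then be saved and passed to the converter in place of the original.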

I hope this solves your issue!

mjalajel, May 07 '20 14:05