spark-json-schema
Script failing while generating schema from nested Json Schema format
Hey guys,
I am trying to generate a Spark schema from the JSON Schema below:
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "maintenanceWorkOrder": {
      "type": "object",
      "properties": {
        "workOrderNumber": {
          "type": "string"
        },
        "workOrderActivity": {
          "type": "array",
          "items": [
            {
              "type": "object",
              "properties": {
                "workOrderNumber": {
                  "type": "string"
                },
                "organization": {
                  "type": "string"
                }
              },
              "required": [
                "workOrderNumber",
                "organization"
              ]
            }
          ]
        }
      },
      "required": [
        "workOrderNumber",
        "workOrderActivity"
      ]
    }
  },
  "required": [
    "maintenanceWorkOrder"
  ]
}
It looks like a very simple JSON schema, but it gives me the error below:
scala> val schema = SchemaConverter.convertContent(json_wo_lines)
play.api.libs.json.JsResultException: JsResultException(errors:List((,List(ValidationError(List(error.expected.jsobject),WrappedArray())))))
  at play.api.libs.json.JsReadable$$anonfun$2.apply(JsReadable.scala:23)
  at play.api.libs.json.JsReadable$$anonfun$2.apply(JsReadable.scala:23)
  at play.api.libs.json.JsResult$class.fold(JsResult.scala:73)
  at play.api.libs.json.JsError.fold(JsResult.scala:13)
  at play.api.libs.json.JsReadable$class.as(JsReadable.scala:21)
  at play.api.libs.json.JsDefined.as(JsLookup.scala:132)
  at org.zalando.spark.jsonschema.SchemaConverter$.getFieldType(SchemaConverter.scala:184)
  at org.zalando.spark.jsonschema.SchemaConverter$.addJsonField(SchemaConverter.scala:171)
  at org.zalando.spark.jsonschema.SchemaConverter$.convertJsonStruct(SchemaConverter.scala:134)
  at org.zalando.spark.jsonschema.SchemaConverter$.getDataType(SchemaConverter.scala:203)
  at org.zalando.spark.jsonschema.SchemaConverter$.getFieldType(SchemaConverter.scala:190)
  at org.zalando.spark.jsonschema.SchemaConverter$.addJsonField(SchemaConverter.scala:171)
  at org.zalando.spark.jsonschema.SchemaConverter$.convertJsonStruct(SchemaConverter.scala:134)
  at org.zalando.spark.jsonschema.SchemaConverter$.convert(SchemaConverter.scala:74)
  at org.zalando.spark.jsonschema.SchemaConverter$.convertContent(SchemaConverter.scala:60)
  ... 49 elided
Let me know if anyone has faced a similar issue and has a resolution.
Thanks,
Hi @3mlabs and thanks for reporting this.
The issue is with the definition of the workOrderActivity array field.
Arrays in JSON Schema model two distinct concepts: lists and tuples. Our library supports list-validated arrays, but we do not currently support tuple-validated arrays. (more details on the difference here)
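To make the difference concrete, here is a minimal sketch using play-json, which the stack trace shows the library relies on. It only checks whether `items` can be read as a JSON object. The object name `ItemsShapeCheck` and the helper `itemsIsObject` are hypothetical names for illustration and are not part of spark-json-schema; the sketch assumes a play-json version where `\` returns a `JsLookupResult`, as the `JsDefined` frame in the trace suggests.

```scala
import play.api.libs.json.{JsObject, Json}

// Hypothetical helper for illustration; not part of spark-json-schema.
object ItemsShapeCheck {
  // Tuple validation: `items` is a JSON array of schemas, one per position.
  val tupleForm: String =
    """{ "type": "array", "items": [ { "type": "string" } ] }"""

  // List validation: `items` is a single schema applied to every element.
  val listForm: String =
    """{ "type": "array", "items": { "type": "string" } }"""

  // True when `items` can be read as a JSON object (the list form).
  def itemsIsObject(schema: String): Boolean =
    (Json.parse(schema) \ "items").asOpt[JsObject].isDefined

  def main(args: Array[String]): Unit = {
    println(s"tuple form has object items: ${itemsIsObject(tupleForm)}")
    println(s"list form has object items: ${itemsIsObject(listForm)}")
  }
}
```

Reading the tuple form with `.as[JsObject]` instead of `.asOpt[JsObject]` throws a `JsResultException` carrying `error.expected.jsobject`, which matches the error reported above.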
If your intention is to make workOrderActivity an array of objects that all share the same structure, like the example below, you can simply change the definition of workOrderActivity.items from an array of objects to a single object.
If this meets your requirement:
{
  "maintenanceWorkOrder": {
    "workOrderNumber": "123",
    "workOrderActivity": [
      {
        "workOrderNumber": "order_001",
        "organization": "org_A"
      },
      {
        "workOrderNumber": "order_002",
        "organization": "org_B"
      }
    ]
  }
}
This definition would be more correct (and would work with our library):
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "maintenanceWorkOrder": {
      "type": "object",
      "properties": {
        "workOrderNumber": {
          "type": "string"
        },
        "workOrderActivity": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "workOrderNumber": {
                "type": "string"
              },
              "organization": {
                "type": "string"
              }
            },
            "required": [
              "workOrderNumber",
              "organization"
            ]
          }
        }
      },
      "required": [
        "workOrderNumber",
        "workOrderActivity"
      ]
    }
  },
  "required": [
    "maintenanceWorkOrder"
  ]
}
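As a rough sketch, the list-validated schema can be passed to the same `SchemaConverter.convertContent` call used in the original report. This assumes spark-json-schema and Spark SQL are on the classpath; the object name `ConvertListSchema` is a hypothetical wrapper for illustration.

```scala
import org.apache.spark.sql.types.StructType
import org.zalando.spark.jsonschema.SchemaConverter

// Hypothetical wrapper object for illustration.
object ConvertListSchema {
  // Same schema as above, with `items` given as a single object.
  val schemaJson: String =
    """{
      |  "$schema": "http://json-schema.org/draft-04/schema#",
      |  "type": "object",
      |  "properties": {
      |    "maintenanceWorkOrder": {
      |      "type": "object",
      |      "properties": {
      |        "workOrderNumber": { "type": "string" },
      |        "workOrderActivity": {
      |          "type": "array",
      |          "items": {
      |            "type": "object",
      |            "properties": {
      |              "workOrderNumber": { "type": "string" },
      |              "organization": { "type": "string" }
      |            },
      |            "required": [ "workOrderNumber", "organization" ]
      |          }
      |        }
      |      },
      |      "required": [ "workOrderNumber", "workOrderActivity" ]
      |    }
      |  },
      |  "required": [ "maintenanceWorkOrder" ]
      |}""".stripMargin

  // Converted lazily so the object can be loaded without side effects.
  lazy val schema: StructType = SchemaConverter.convertContent(schemaJson)

  def main(args: Array[String]): Unit =
    // Print the resulting Spark schema as an indented tree.
    schema.printTreeString()
}
```

In spark-shell, the body of `main` can be run directly once `schemaJson` is defined.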
I hope this solves your issue!