json-avro-converter
field [fieldName] is expected to be one of these: RECORD, NULL, for nested record with non defined nullable values
I got: `Could not evaluate union, field [fieldName] is expected to be one of these: RECORD, NULL. If this is a complex type, check if offending field: trafficSource.adwordsClickInfo adheres to schema.` when I have nested records where some of the nullable fields are not specified.
Schema sample:

```json
{
  "type": "record",
  "name": "Root",
  "fields": [
    {
      "name": "field1",
      "type": ["long", "null"]
    },
    {
      "name": "nestedRecord",
      "type": [
        {
          "type": "record",
          "namespace": "root",
          "name": "NestedRecord",
          "fields": [
            {
              "name": "nested1",
              "type": ["long", "null"]
            },
            {
              "name": "nested2",
              "type": ["long", "null"]
            }
          ]
        },
        "null"
      ]
    }
  ]
}
```
and a JSON string such as:

```json
{
  "field1": 10999859003,
  "nestedRecord": {
    "nested1": 123321321
  }
}
```
I think that when the algorithm recurses it fails to skip missing nullable values, even though it skips them correctly at level 0.
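To illustrate what I mean, here is a minimal sketch (in Python, and purely hypothetical — this is not the library's code) of the resolution I expected: a missing field whose union contains "null" falls back to null at any depth, not just at level 0:

```python
# Hypothetical sketch, not json-avro-converter's actual code: a recursive
# resolver that fills any missing field whose union contains "null" with
# None, at every nesting depth.

def resolve(schema, datum):
    """Match `datum` against an Avro-style schema given as plain dicts/lists."""
    if isinstance(schema, list):              # union, e.g. ["long", "null"]
        if datum is None and "null" in schema:
            return None                       # missing nullable field -> null
        for branch in schema:
            if branch != "null":
                return resolve(branch, datum)
        raise ValueError("could not evaluate union")
    if isinstance(schema, dict) and schema.get("type") == "record":
        # key step: apply the same null fallback on every recursion level,
        # not only for the root record
        return {f["name"]: resolve(f["type"], datum.get(f["name"]))
                for f in schema["fields"]}
    return datum                              # primitive: pass through

nested = {"type": "record", "name": "NestedRecord", "fields": [
    {"name": "nested1", "type": ["long", "null"]},
    {"name": "nested2", "type": ["long", "null"]},
]}
root = {"type": "record", "name": "Root", "fields": [
    {"name": "field1", "type": ["long", "null"]},
    {"name": "nestedRecord", "type": [nested, "null"]},
]}

result = resolve(root, {"field1": 10999859003,
                        "nestedRecord": {"nested1": 123321321}})
# nested2 resolves to None instead of raising "could not evaluate union"
```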
Thank you
Hey @gadaldo, in this case the error json-avro-converter throws is correct, because `nested2` is missing from your sample JSON and the schema provides no default for it. Avro should accept this datum only in one of those two cases, and I've verified that it does in both with this code:
**(using a default value)**

```groovy
def 'should convert nested nullable records'() {
    given:
    def schema = '''
    {
      "type": "record",
      "name": "Root",
      "fields": [
        {
          "name": "field1",
          "type": ["long", "null"]
        },
        {
          "name": "nestedRecord",
          "type": [
            {
              "type": "record",
              "namespace": "root",
              "name": "NestedRecord",
              "fields": [
                {
                  "name": "nested1",
                  "type": ["long", "null"]
                },
                {
                  "name": "nested2",
                  "type": ["long", "null"],
                  "default": 42
                }
              ]
            },
            "null"
          ]
        }
      ]
    }
    '''
    def json = '''
    {
      "field1": 10999859003,
      "nestedRecord": {
        "nested1": 123321321,
        "nested2": 42
      }
    }
    '''

    when:
    def result = converter.convertToJson(converter.convertToAvro(json.bytes, schema), schema)

    then:
    toMap(result) == toMap(json)
}
```
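A side note on the default used above: per the Avro specification, a union field's default value must conform to the *first* branch of the union, so `"default": 42` works here because `"long"` is listed first. To make a field default to null instead, the union is conventionally written null-first:

```json
{
  "name": "nested2",
  "type": ["null", "long"],
  "default": null
}
```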
**(using a provided value)**

```groovy
def 'should convert nested nullable records2'() {
    given:
    def schema = '''
    {
      "type": "record",
      "name": "Root",
      "fields": [
        {
          "name": "field1",
          "type": ["long", "null"]
        },
        {
          "name": "nestedRecord",
          "type": [
            {
              "type": "record",
              "namespace": "root",
              "name": "NestedRecord",
              "fields": [
                {
                  "name": "nested1",
                  "type": ["long", "null"]
                },
                {
                  "name": "nested2",
                  "type": ["long", "null"]
                }
              ]
            },
            "null"
          ]
        }
      ]
    }
    '''
    def json = '''
    {
      "field1": 10999859003,
      "nestedRecord": {
        "nested1": 123321321,
        "nested2": 43
      }
    }
    '''

    when:
    def result = converter.convertToJson(converter.convertToAvro(json.bytes, schema), schema)

    then:
    toMap(result) == toMap(json)
}
```
At level 0 of the tree it does; the problem appears only when the algorithm recurses. Anyway, I created my own version because I needed it: the JSON comes from a TableRow object when reading from BigQuery (with BigQueryIO), and I have to transform it into Avro. It's a feature Google performs behind the scenes, but they don't want to expose the API since they do a further intermediate transformation to proto, as documented here. So I created my own converter based on this algorithm. I don't know whether you want to close the issue. Thank you anyway.
Has this issue been fixed? Please let me know how to resolve it.