tap-mongodb
Writes broken schemas in 2.0.0
After switching to 2.0.0, I found that this tap outputs bad schema messages:
Original schema:
{
  "properties": {
    "_id": {
      "type": "string"
    },
    "student_id": {
      "type": "integer"
    },
    "class_id": {
      "type": "integer"
    },
    "scores": {
      "items": {
        "properties": {
          "type": {
            "type": "string"
          },
          "score": {
            "type": "number"
          }
        },
        "type": "object"
      },
      "type": "array"
    }
  },
  "type": "object",
  "additionalProperties": true
}
Schema output by tap:
{
  "type": "object",
  "properties": {
    "scores": {
      "anyOf": [
        {
          "type": "array",
          "items": {
            "anyOf": [
              {
                "type": "object",
                "properties": {
                  "score": {
                    "anyOf": [{"type": "number"}, {}]
                  }
                }
              },
              {}
            ]
          }
        },
        {}
      ]
    }
  }
}
All my records have the same schema, and they look like this:
{
  "_id": "50b59cd75bed76f46522c34e",
  "student_id": 0,
  "class_id": 2,
  "scores": [
    {
      "type": "exam",
      "score": 57.92947112575566
    },
    {
      "type": "quiz",
      "score": 21.24542588206755
    },
    {
      "type": "homework",
      "score": 68.1956781058743
    },
    {
      "type": "homework",
      "score": 67.95019716560351
    },
    {
      "type": "homework",
      "score": 18.81037253352722
    }
  ]
}
Hi @edgarrmondragon -- the tap-mongodb schema generation was introduced in #40, and since MongoDB is NoSQL, it does not attempt to write a strict schema. Instead, it only writes a schema for date-times, decimals, and numbers as it sees them. This is intended to overcome problems that users were experiencing in certain targets where date-times were being written as strings, and doubles/decimals were sometimes being split into different columns depending on the precision of the value.
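To make the "only as it sees them" behavior concrete, here is a rough Python sketch of how such a loose schema could be derived from a single record. This is not the tap's actual code: the function names `typed_schema` and `loose_node` are hypothetical, and the mappings for `datetime` and `Decimal` values are assumptions. Only the `float` case and the `anyOf` wrapping are taken from the schema shown in this thread.

```python
from datetime import datetime
from decimal import Decimal


def typed_schema(value):
    """Return a typed schema for `value`, or None if it holds nothing tracked."""
    if isinstance(value, datetime):
        # Assumed mapping; the thread only says date-times get a schema.
        return {"type": "string", "format": "date-time"}
    if isinstance(value, (float, Decimal)):
        return {"type": "number"}
    if isinstance(value, dict):
        props = {}
        for key, child in value.items():
            node = loose_node(child)
            if node is not None:
                props[key] = node
        return {"type": "object", "properties": props} if props else None
    if isinstance(value, list):
        # Simplification: describe the array by its first element
        # that contains a tracked value.
        for element in value:
            node = loose_node(element)
            if node is not None:
                return {"type": "array", "items": node}
        return None
    # Strings, integers, and everything else are left undescribed.
    return None


def loose_node(value):
    """Wrap a typed schema in anyOf with {} so other shapes stay valid."""
    typed = typed_schema(value)
    return None if typed is None else {"anyOf": [typed, {}]}


record = {
    "_id": "50b59cd75bed76f46522c34e",
    "student_id": 0,
    "class_id": 2,
    "scores": [
        {"type": "exam", "score": 57.92947112575566},
        {"type": "quiz", "score": 21.24542588206755},
    ],
}

schema = typed_schema(record)
```

Under these assumptions, `schema` contains only the `scores` property, with the same `anyOf` nesting as the schema output by the tap. The string and integer fields (`_id`, `student_id`, `class_id`) are absent because nothing in the sketch tracks them.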
Also, I checked https://www.jsonschemavalidator.net/, and your record appears to validate against the schema that was generated.
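The same check can be approximated locally. The sketch below is a deliberately tiny validator that covers only the keywords used in this thread (`type`, `properties`, `items`, `anyOf`). It is not a full JSON Schema implementation, and `is_valid` and `TYPE_CHECKS` are hypothetical names. It also shows why the generated schema is so permissive: in JSON Schema an empty schema `{}` accepts any value, so every `anyOf` branch that includes `{}` is always satisfied.

```python
# Minimal checks for the JSON types that appear in this thread.
TYPE_CHECKS = {
    "object": lambda v: isinstance(v, dict),
    "array": lambda v: isinstance(v, list),
    "string": lambda v: isinstance(v, str),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
}


def is_valid(instance, schema):
    """Validate `instance` against a small subset of JSON Schema."""
    if "anyOf" in schema:
        if not any(is_valid(instance, sub) for sub in schema["anyOf"]):
            return False
    expected = schema.get("type")
    if expected is not None and not TYPE_CHECKS[expected](instance):
        return False
    if isinstance(instance, dict):
        for key, sub in schema.get("properties", {}).items():
            if key in instance and not is_valid(instance[key], sub):
                return False
    if isinstance(instance, list) and "items" in schema:
        if not all(is_valid(item, schema["items"]) for item in instance):
            return False
    return True


GENERATED_SCHEMA = {
    "type": "object",
    "properties": {
        "scores": {
            "anyOf": [
                {
                    "type": "array",
                    "items": {
                        "anyOf": [
                            {
                                "type": "object",
                                "properties": {
                                    "score": {"anyOf": [{"type": "number"}, {}]}
                                },
                            },
                            {},
                        ]
                    },
                },
                {},
            ]
        }
    },
}

RECORD = {
    "_id": "50b59cd75bed76f46522c34e",
    "student_id": 0,
    "class_id": 2,
    "scores": [
        {"type": "exam", "score": 57.92947112575566},
        {"type": "quiz", "score": 21.24542588206755},
    ],
}

record_ok = is_valid(RECORD, GENERATED_SCHEMA)
# A record with a non-array "scores" also passes, because of the {} branch.
odd_record_ok = is_valid({"scores": "not a list"}, GENERATED_SCHEMA)
```

Both `record_ok` and `odd_record_ok` come out `True` here. So the generated schema is valid for the sample record, but it does not constrain `scores` the way the original schema does.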