Feature request: allow us to reference schemas within other schemas
This would be enormously helpful. Let's say I have many types of transactions that involve users, and a user looks like this:
{
  "namespace": "example.avro",
  "type": "record",
  "name": "user",
  "fields": [
    {
      "name": "name",
      "type": "string"
    },
    {
      "name": "id",
      "type": "int"
    }
  ]
}
And I have many schemas that involve users as a category. For example:
{
  "namespace": "example.avro",
  "type": "record",
  "name": "UserFriends",
  "fields": [
    {
      "name": "user",
      "type": {
        "namespace": "example.avro",
        "type": "record",
        "name": "user",
        "fields": [
          {
            "name": "name",
            "type": "string"
          },
          {
            "name": "id",
            "type": "int"
          }
        ]
      }
    },
    {
      "name": "friends",
      "type": {"type": "array", "items": {"type": "example.avro.user"}}
    }
  ]
}
And:
{
  "namespace": "example.avro",
  "type": "record",
  "name": "Purchase",
  "fields": [
    {
      "name": "customer",
      "type": {
        "namespace": "example.avro",
        "type": "record",
        "name": "user",
        "fields": [
          {
            "name": "name",
            "type": "string"
          },
          {
            "name": "id",
            "type": "int"
          }
        ]
      }
    },
    {
      "name": "productId",
      "type": "int"
    }
  ]
}
It would be wonderful if I could create a distinct schema registry entry for the user schema, and just reference that schema (and the version) in other schemas. It would be even cooler to reference the latest version of a schema.
But even just referencing a schema ID and a version number would be a big help. As it is, it's quite complicated to keep track of this kind of situation. When I update user by adding a field, I have to update all the schemas that have users.
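In the meantime, one workaround is to keep the user definition in a single place and compose the dependent schemas in a build step before registering them. A minimal Python sketch, assuming the schemas are kept in code as dicts; `USER` and `with_user_field` are hypothetical names of my own and nothing here is registry API:

```python
import copy
import json

# Single source of truth for the shared record (same content as the user schema above).
USER = {
    "namespace": "example.avro",
    "type": "record",
    "name": "user",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "id", "type": "int"},
    ],
}


def with_user_field(record_name, field_name, extra_fields):
    """Build a record schema whose first field embeds the shared user record.

    Hypothetical helper: the user definition is inlined exactly once, so any
    later field can refer to it by its full name, "example.avro.user".
    """
    return {
        "namespace": "example.avro",
        "type": "record",
        "name": record_name,
        "fields": [{"name": field_name, "type": copy.deepcopy(USER)}] + extra_fields,
    }


purchase = with_user_field(
    "Purchase", "customer", [{"name": "productId", "type": "int"}]
)
user_friends = with_user_field(
    "UserFriends",
    "user",
    [{"name": "friends", "type": {"type": "array", "items": "example.avro.user"}}],
)

if __name__ == "__main__":
    # Each generated schema would still be registered separately.
    print(json.dumps(purchase, indent=2))
```

This only removes the copy-and-paste; every dependent schema still has to be regenerated and re-registered when user changes, which is why a reference inside the registry would be nicer.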
@dmsolow Including existing schemas is supported with the includeSchemas attribute in schemas. I hope it solves the scenario you mentioned in your earlier comment.
@satishd I thought it might solve my scenario, but I'm confused as to when the referenced schemas are resolved.
I made two test schemas, with schema A referencing schema B using the includeSchemas key. However, when I retrieve schema A from the registry, the returned schema does not contain the type definition that was referenced from schema B. Instead, it still includes the includeSchemas key.
I guess I figured that the schema registry would resolve the references by itself? What am I missing here?
@dmsolow The existing SchemaRegistryClient APIs do not return resultant schemas. The Avro deserializer internally resolves schemas that carry the mentioned attribute. You need to do something similar to build resultant schemas, using AvroSchemaResolver#resolveSchema.
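For clients that are not on the JVM, the same idea can be approximated by hand: fetch each included schema, index the named types it defines, and inline each one at its first use in the including schema. The sketch below is only an illustration in Python, not a port of AvroSchemaResolver. The `resolve` and `fetch` names are hypothetical, and the shape of an includeSchemas entry (`{"name": ..., "version": ...}`, with version optional) is an assumption that should be checked against the registry docs.

```python
import copy

PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}
NAMED = ("record", "error", "enum", "fixed")


def _full_name(node, ns):
    """Return (full name, namespace) for a named-type definition."""
    name = node["name"]
    space = node.get("namespace", ns)
    if "." not in name and space:
        name = f"{space}.{name}"
    return name, name.rpartition(".")[0] or None


def _walk(node, ns, available, defined):
    """Inline each included type at its first use; later uses stay name references."""
    if isinstance(node, str):
        full = node if "." in node or not ns else f"{ns}.{node}"
        if node in PRIMITIVES or full in defined or full not in available:
            return node
        definition = copy.deepcopy(available[full])
        space, _, short = full.rpartition(".")
        definition["name"] = short
        if space:
            definition["namespace"] = space  # make the namespace explicit
        return _walk(definition, ns, available, defined)
    if isinstance(node, list):  # union
        return [_walk(branch, ns, available, defined) for branch in node]
    kind = node.get("type")
    if kind in NAMED:
        full, ns = _full_name(node, ns)
        if full in defined:
            return full  # already emitted once, so refer to it by name
        defined[full] = node
        for field in node.get("fields", []):
            field["type"] = _walk(field["type"], ns, available, defined)
        return node
    if kind == "array":
        node["items"] = _walk(node["items"], ns, available, defined)
    elif kind == "map":
        node["values"] = _walk(node["values"], ns, available, defined)
    else:
        inner = _walk(kind, ns, available, defined)
        if len(node) == 1:  # bare wrapper such as {"type": "example.avro.user"}
            return inner
        node["type"] = inner
    return node


def resolve(schema, fetch):
    """Return a copy of `schema` with its includeSchemas entries inlined.

    `fetch(name, version)` is supplied by the caller and returns the stored
    schema as parsed JSON; a version of None is taken to mean "latest".
    """
    schema = copy.deepcopy(schema)
    available = {}
    includes = schema.pop("includeSchemas", []) if isinstance(schema, dict) else []
    for ref in includes:
        included = resolve(fetch(ref["name"], ref.get("version")), fetch)
        _walk(included, None, {}, available)  # index the types it defines
    return _walk(schema, None, available, {})


# Example data: a "utils" schema that is a union of two records.
UTILS = [
    {"namespace": "example.avro", "type": "record", "name": "address",
     "fields": [{"name": "city", "type": "string"}]},
    {"namespace": "example.avro", "type": "record", "name": "user",
     "fields": [{"name": "name", "type": "string"}, {"name": "id", "type": "int"}]},
]
PURCHASE = {
    "namespace": "example.avro", "type": "record", "name": "Purchase",
    "includeSchemas": [{"name": "utils"}],
    "fields": [{"name": "customer", "type": "user"},
               {"name": "productId", "type": "int"}],
}

resolved = resolve(PURCHASE, lambda name, version: {"utils": UTILS}[name])
```

In practice `fetch` would wrap whatever HTTP call returns the schema text for a name and version. Only the types that are actually referenced get inlined, so the unused address record does not end up in the resolved Purchase schema.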
@satishd Okay, that's unfortunate. From my perspective, an endpoint that returned the fully resolved schema would be very useful. I'll consider my options from here. I can probably still use the schema registry, but it will require a bit of legwork since I'm not really planning to use Java for serializers/deserializers.
@satishd I'm finding this feature to be kind of lacking in 0.5.4.
It would be nice, for example, to have one "utils" schema that's a union of record types, then include the "utils" schema in another schema and reference one or more of the record types. However, this does not work currently.
I'm also having trouble adding a new version of a schema that takes advantage of this feature; it seems fairly broken all around. For example, if I create a schema that references one type from another schema, and then try to add a new version that references another type from the same schema, it fails.