SchemaGroup and enforcing uniqueness within the group or across the SchemaRegistry

Open harshach opened this issue 8 years ago • 4 comments

Let's say I want to register the schemaName "person" with the following schema for NiFi:

    {
      "name": "person",
      "namespace": "nifi",
      "type": "record",
      "fields": [
        { "name": "id", "type": "string" },
        { "name": "firstName", "type": "string", "aliases": [ "first_name" ] },
        { "name": "lastName", "type": "string", "aliases": [ "last_name" ] },
        { "name": "email", "type": "string" },
        { "name": "gender", "type": "string" },
        { "name": "ipAddress", "type": "string", "aliases": [ "ip_address" ] }
      ]
    }

Should we allow users to use the same schemaName under a different group? For example, if I want to use schemaName "person" under schemaGroup "kafka":

    {
      "name": "person",
      "namespace": "nifi",
      "type": "record",
      "fields": [
        { "name": "id", "type": "string" },
        { "name": "firstName", "type": "string", "aliases": [ "first_name" ] },
        { "name": "lastName", "type": "string", "aliases": [ "last_name" ] },
        { "name": "email", "type": "string" },
        { "name": "gender", "type": "string" },
    }

The above request comes back as success, but I don't see a new schema getting registered. cc @satishd

harshach, May 03 '17 20:05

Currently, the schema name (in the schema metadata) must be unique across the registry, irrespective of the group. When schema metadata with the given schema name is already registered, the registry returns the earlier registered schema metadata. When you try to add a schema under existing schema metadata, it is added as a new version of the existing schema and the schema version id is returned.
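To make the behaviour described above concrete, here is a minimal in-memory sketch. It is a toy model only, not the Schema Registry API: the class `InMemoryRegistry`, its method names, and the use of a simple 1-based counter as the "version id" are all assumptions made for illustration.

```python
class InMemoryRegistry:
    """Toy model of the described behaviour; hypothetical, not the real API."""

    def __init__(self):
        self._metadata = {}  # schema name -> metadata (the name alone is the key)
        self._versions = {}  # schema name -> list of schema texts

    def register_metadata(self, name, group):
        # If the name is already registered, the earlier metadata is returned
        # unchanged, even when the caller supplied a different group.
        if name not in self._metadata:
            self._metadata[name] = {"name": name, "group": group}
            self._versions[name] = []
        return self._metadata[name]

    def add_schema(self, name, schema_text):
        # A schema added under existing metadata becomes a new version.
        # Assumption: the version id is modelled as a 1-based counter.
        versions = self._versions[name]
        versions.append(schema_text)
        return len(versions)


registry = InMemoryRegistry()
first = registry.register_metadata("person", "nifi")
second = registry.register_metadata("person", "kafka")  # "succeeds", but nothing new is created
v1 = registry.add_schema("person", '{"type": "string"}')
v2 = registry.add_schema("person", '{"type": "int"}')
```

In this model the second `register_metadata` call returns the metadata registered under "nifi", which matches the symptom in the report: the request succeeds, yet no schema appears under "kafka".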

Adding an API as part of #77 to throw an error when you try to register new schema metadata with the existing name.

satishd, May 04 '17 05:05

Enforcing uniqueness on the combination of schema name and group gives separation of schemas at the group level. We started with that separation, but I guess it was decided that users can follow a naming convention like "group.name" for the schema metadata name, and that the name is unique within a registry cluster.
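The "group.name" convention can be sketched as follows. This is an illustration under assumptions: `qualified_name`, `register`, and the `registered` dictionary are hypothetical helpers standing in for a registry that enforces uniqueness on the name alone.

```python
registered = {}  # fully qualified schema name -> schema text


def qualified_name(group, name):
    # Fold the group into the name so that the name alone stays unique.
    return f"{group}.{name}"


def register(group, name, schema_text):
    full_name = qualified_name(group, name)
    if full_name in registered:
        raise ValueError(f"schema name already registered: {full_name}")
    registered[full_name] = schema_text
    return full_name


# The same short name can now coexist for two groups.
nifi_name = register("nifi", "person", '{"type": "string"}')
kafka_name = register("kafka", "person", '{"type": "string"}')
```

The trade-off is that the separation lives purely in the naming discipline of the users; nothing in the registry itself knows that "nifi.person" and "kafka.person" belong to different groups.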

+1 with the current abstraction ~~on having separation~~. I would like to hear opinions on whether there are any use cases for which this abstraction is not appropriate.

satishd, May 04 '17 05:05

@michaelandrepearce any thoughts on this matter?

harshach, Jun 22 '17 04:06

+1 One main point is that the numerical ids generated and used in serialised data must remain globally unique (as these globally identify the schema once registered irrespective of group or name).

Having the schema name uniqueness I think is sensible, as it essentially allows different sub-units within an organisation to become self-managing without the need either to set up separate repos or to have sub-units collide with each other.
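One way to picture the two points above together is a registry that scopes names by group while drawing ids from a single global counter. This is only a sketch under assumptions: `GroupScopedRegistry` and its methods are hypothetical and do not describe how the registry is implemented.

```python
import itertools


class GroupScopedRegistry:
    """Hypothetical model: names unique per group, ids unique globally."""

    def __init__(self):
        self._next_id = itertools.count(1)  # one counter shared by all groups
        self._ids = {}    # (group, name) -> schema id
        self._by_id = {}  # schema id -> (group, name)

    def register(self, group, name):
        key = (group, name)
        if key in self._ids:
            raise ValueError(f"{name!r} already exists in group {group!r}")
        schema_id = next(self._next_id)
        self._ids[key] = schema_id
        self._by_id[schema_id] = key
        return schema_id

    def lookup(self, schema_id):
        # Serialised data carries only the id, so the id alone must resolve.
        return self._by_id[schema_id]


registry = GroupScopedRegistry()
nifi_id = registry.register("nifi", "person")
kafka_id = registry.register("kafka", "person")
```

Because a consumer resolves a schema from the id alone, the ids cannot be scoped per group even if the names are.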

michaelandrepearce, Jun 22 '17 07:06