openschema
[SECOND ROUND] Review Discussion
I opened this issue to discuss an analysis of the OpenSchema specification compared with the CloudEvents Schema Registry API.
Both projects target the same domain: vendor-neutral specifications for storing, accessing, and managing schema documents such as JSON Schema and Apache Avro types.
However, a thorough comparison of the two specifications shows some noticeable differences. The details are explained below, along with the pros and cons of each specification. :)
As the following shows, the three-layer path hierarchy of the CloudEvents Schema Registry (left) differs from the two-layer hierarchy of OpenSchema (right), which has no SchemaGroup layer. However, the concepts of Schema vs. Subject and of Schema Versions are equivalent in both specs.

The following table compares the two specifications:
| CloudEvents SchemaRegistry | OpenSchema |
|---|---|
| Lightweight; contains the fundamental attributes of a schema spec, such as id, version, and format | Rich spec that covers nearly all attributes of the CloudEvents SchemaRegistry and adds tenant, status, and compatibility, which support better lifecycle management for a schema registry |
| SchemaGroup hierarchy for managing groups of schemas | No SchemaGroup hierarchy. Schemas can be grouped by the namespace or tenant attribute of the subject hierarchy |
| The schema group must be part of the REST API URL, e.g. /schemagroups/5/schemas/1/version/3 | N/A, e.g. /subjects/1/version/3 |
| An id attribute identifies every level of the path hierarchy | The subject attribute identifies the top level; id identifies the schema version |
| Replication Model | Consider adding a transformer from source to target |
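To make the URL difference in the table concrete, here is a minimal sketch that builds both example paths. The function names are hypothetical, and the path shapes are copied from the examples in this issue rather than from either specification's text, so treat them as illustrative only.

```python
from urllib.parse import quote


def cloudevents_version_path(group_id: str, schema_id: str, version: int) -> str:
    """Three-layer path: schema group -> schema -> version (shape taken from this issue)."""
    return (
        f"/schemagroups/{quote(group_id, safe='')}"
        f"/schemas/{quote(schema_id, safe='')}"
        f"/version/{version}"
    )


def openschema_version_path(subject: str, version: int) -> str:
    """Two-layer path: subject -> version (shape taken from this issue)."""
    return f"/subjects/{quote(subject, safe='')}/version/{version}"


if __name__ == "__main__":
    print(cloudevents_version_path("5", "1", 3))
    print(openschema_version_path("1", 3))
```

The two-layer builder needs one identifier fewer, which is the practical effect of dropping the SchemaGroup layer.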
CloudEvents SchemaRegistry Pros:
- Because the spec is lightweight, it is easy to understand and implement.
- The Schema hierarchy has an authority attribute.
CloudEvents SchemaRegistry Cons:
- The three-layer path hierarchy (SchemaGroup/Schema/Version) is a bit overkill. In particular, REST API URLs must be longer to reflect all three layers for CRUD operations against the schema registry (e.g. /schemagroups/5/schemas/1/version/3).
- Having an id attribute in each hierarchy layer may be confusing at first glance, for example the id attributes in SchemaGroup and Schema.
- The Schema Version hierarchy does not clearly define the attribute that holds the actual "Body" of the schema document.
- The id in the Schema hierarchy is actually a unique, user-defined name, whereas the id in SchemaVersion is more like a UUID, such as a reference id in a database. Having both can be confusing at first glance.
- The SchemaGroup and Schema hierarchies have nearly the same set of attributes (id, format, description, createdtime, updatedtime), which makes the SchemaGroup hierarchy look somewhat redundant.
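The attribute overlap described in the last point can be sketched as follows. The field lists are taken only from this discussion (not from the CloudEvents spec text), the string types are placeholders, and `shared_attributes` is a hypothetical helper.

```python
from dataclasses import dataclass, fields


@dataclass
class SchemaGroup:
    # Attributes as listed in this discussion; types are assumptions.
    id: str
    format: str
    description: str
    createdtime: str
    updatedtime: str


@dataclass
class Schema:
    # Same attributes as SchemaGroup, plus authority (per the pros list above).
    id: str
    format: str
    description: str
    createdtime: str
    updatedtime: str
    authority: str


def shared_attributes(a, b) -> set:
    """Return the attribute names that two dataclasses have in common."""
    return {f.name for f in fields(a)} & {f.name for f in fields(b)}


if __name__ == "__main__":
    print(sorted(shared_attributes(SchemaGroup, Schema)))
```

Under these assumptions, every SchemaGroup attribute also appears on Schema, which is the redundancy the point above refers to.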
OpenSchema Pros:
- Rich spec, more like a superset of the CloudEvents SchemaRegistry.
- The spec considers multi-tenancy, schema compatibility, and schema status, which together address more complete lifecycle management for a schema registry.
- A simple two-layer path hierarchy supports grouping of schemas while requiring shorter REST API URLs, for example /subjects/1/version/3.
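As a rough illustration of grouping without a SchemaGroup layer, the sketch below groups subjects by their tenant and namespace attributes. The `Subject` class, its field types, and the sample values are assumptions made for this example. Only the attribute names come from the discussion.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Subject:
    # Hypothetical minimal subject; only the attribute names come from this discussion.
    subject: str
    tenant: str
    namespace: str


def group_subjects(subjects: Iterable[Subject]) -> Dict[Tuple[str, str], List[str]]:
    """Group subject names by (tenant, namespace), emulating a schema group."""
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for s in subjects:
        groups[(s.tenant, s.namespace)].append(s.subject)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        Subject("orders", tenant="acme", namespace="sales"),
        Subject("invoices", tenant="acme", namespace="sales"),
        Subject("users", tenant="acme", namespace="identity"),
    ]
    for key, names in group_subjects(sample).items():
        print(key, names)
```

The grouping key lives on the subject itself, so no extra path segment is needed to express it.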
OpenSchema Cons:
- A rich spec requires better documentation, with use cases and examples of how each part of the spec and each attribute works.
- createdtime and updatedtime are missing from the spec; this can be raised with the community as a proposed update.
As a quick reference, the following image illustrates both specs along with the attributes of each hierarchy level.
