spring-cloud-schema-registry AvroSchemaRegistryClientMessageConverter does not allow versionedSchema regex to be overridden

Open palmski opened this issue 5 years ago • 0 comments

Describe the issue

The versionedSchema regex in AvroSchemaRegistryClientMessageConverter expects the subject part of the schema to be alphanumeric. It also does not account for the fact that the Confluent schema registry is case sensitive, whereas MimeType converts its value to lowercase.

We have implemented a custom org.springframework.cloud.stream.schema.avro.SubjectNamingStrategy, driven by an enterprise requirement that registered schema names carry a certain prefix. That prefix contains non-alphanumeric characters and mixed case.

For example, a schema named "foobar" by default would be registered as SharedKafka_1234.foobar-value. When converted to a MimeType this becomes application/vnd.sharedkafka_1234.foobar-value.v4+avro, which fails the regex check in AvroSchemaRegistryClientMessageConverter here:

	private SchemaReference extractSchemaReference(MimeType mimeType) {
		SchemaReference schemaReference = null;
		Matcher schemaMatcher = this.versionedSchema.matcher(mimeType.toString());
		if (schemaMatcher.find()) {
			String subject = schemaMatcher.group(1);
			Integer version = Integer.parseInt(schemaMatcher.group(2));
			schemaReference = new SchemaReference(subject, version, AVRO_FORMAT);
		}
		return schemaReference;
	}

meaning the schema reference (subject and version) is never extracted. Furthermore, even if the schema reference could be extracted, the subsequent lookup in the schemaRegistryClient would fail because the lowercased subject does not match the case-sensitive registered name.

This prevents us from evolving our schemas: the local schema is used instead, and it is incompatible with the incoming message.
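The mismatch can be reproduced without Spring on the classpath. The following is a minimal sketch, not the library's code: the pattern string is an assumption about the default versionedSchema regex (prefix "vnd", format "avro"), and toContentType is a hypothetical helper that only emulates the lowercasing described above.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VersionedSchemaDemo {

    // ASSUMPTION: approximates the converter's default pattern; not copied from this issue.
    static final Pattern VERSIONED_SCHEMA = Pattern.compile(
            "application/vnd\\.([\\p{Alnum}\\$\\.]+)\\.v(\\p{Digit}+)\\+avro");

    // Returns "subject:version" when the content type matches, otherwise null.
    static String extract(String contentType) {
        Matcher m = VERSIONED_SCHEMA.matcher(contentType);
        return m.find() ? m.group(1) + ":" + Integer.parseInt(m.group(2)) : null;
    }

    // Hypothetical helper: builds the content type and lowercases it, as MimeType is reported to do.
    static String toContentType(String subject, int version) {
        return ("application/vnd." + subject + ".v" + version + "+avro")
                .toLowerCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        // Plain alphanumeric subject: reference is extracted.
        System.out.println(extract(toContentType("foobar", 4)));
        // Subject with '_' and '-': no match, so no schema reference.
        System.out.println(extract(toContentType("SharedKafka_1234.foobar-value", 4)));
    }
}
```

Under these assumptions the first call yields foobar:4 and the second yields null, which corresponds to extractSchemaReference returning no reference and the converter falling back to the local schema.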

To Reproduce

1. Register schema version "n" using a custom SubjectNamingStrategy that produces a subject containing a non-alphanumeric character.
2. Evolve the schema to version "n+1" by adding an optional field.
3. Produce a message with schema version "n+1", using the same custom SubjectNamingStrategy.
4. Attempt to consume the message with a consumer using schema version "n".
5. Observe a stack trace similar to:

Caused by: java.lang.ArrayIndexOutOfBoundsException: 24
	at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:460) ~[avro-1.9.0.jar:1.9.0]
	at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:283) ~[avro-1.9.0.jar:1.9.0]
	at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:178) ~[avro-1.9.0.jar:1.9.0]
	at org.apache.avro.specific.SpecificDatumReader.readField(SpecificDatumReader.java:136) ~[avro-1.9.0.jar:1.9.0]
	at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:237) ~[avro-1.9.0.jar:1.9.0]
	at org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123) ~[avro-1.9.0.jar:1.9.0]
	at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:170) ~[avro-1.9.0.jar:1.9.0]
	at org.apache.avro.specific.SpecificDatumReader.readField(SpecificDatumReader.java:136) ~[avro-1.9.0.jar:1.9.0]
	at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:237) ~[avro-1.9.0.jar:1.9.0]
	at org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123) ~[avro-1.9.0.jar:1.9.0]
	at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:170) ~[avro-1.9.0.jar:1.9.0]
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151) ~[avro-1.9.0.jar:1.9.0]
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:144) ~[avro-1.9.0.jar:1.9.0]
	at org.springframework.cloud.stream.schema.avro.AbstractAvroMessageConverter.convertFromInternal(AbstractAvroMessageConverter.java:105) ~[spring-cloud-stream-schema-2.1.3.RELEASE.jar:2.1.3.RELEASE]

Version of the framework

2.1.3-RELEASE

Expected behavior

The regex is overridable, and the schema registry client takes case sensitivity into account (MimeType lowercases the subject, whereas the Confluent schema registry is case sensitive).
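To illustrate why both parts of the expected behavior are needed, here is a hedged sketch with a hypothetical, more permissive pattern that also accepts '_' and '-' in the subject. The pattern and class name are illustrative assumptions, not an existing option of the library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RelaxedSchemaPatternDemo {

    // HYPOTHETICAL override: the subject may also contain '_' and '-'.
    static final Pattern RELAXED = Pattern.compile(
            "application/vnd\\.([\\p{Alnum}$._\\-]+)\\.v(\\p{Digit}+)\\+avro");

    // Returns {subject, version} when the content type matches, otherwise null.
    static String[] extract(String contentType) {
        Matcher m = RELAXED.matcher(contentType);
        return m.find() ? new String[] {m.group(1), m.group(2)} : null;
    }

    public static void main(String[] args) {
        String registered = "SharedKafka_1234.foobar-value";
        String[] ref = extract("application/vnd.sharedkafka_1234.foobar-value.v4+avro");

        System.out.println(ref[0] + " v" + ref[1]);
        // The extracted subject is lowercase, so a case-sensitive lookup would still miss.
        System.out.println(ref[0].equals(registered));
    }
}
```

With the relaxed pattern the subject and version are extracted, but the subject comes back as sharedkafka_1234.foobar-value, so a case-sensitive registry lookup against SharedKafka_1234.foobar-value would still not find it. Overriding the regex alone is therefore not sufficient.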

Mar 31 '20 11:03 palmski
