cloudflow icon indicating copy to clipboard operation
cloudflow copied to clipboard

Avro backward compatibility

Open unit7-0 opened this issue 3 years ago • 0 comments

Is your feature request related to a problem? Please describe. The question is related to backward compatibility support when changing avro schemas.

Is your feature request related to a specific runtime of cloudflow or applicable for all runtimes? All runtimes.

Describe the solution you'd like I faced that the cloudflow serialization/deserialization engine does not support backward compatibility for avro messages. I did a some research on the code and found that when creating the codec, only one schema is used, which is presented along with the generated class during compilation(https://github.com/lightbend/cloudflow/blob/master/core/cloudflow-streamlets/src/main/scala/cloudflow/streamlets/avro/AvroCodec.scala#L30). However to maintain backward compatibility in the java avro serialization mechanism, it is necessary to pass two schemas to the SpecificDatumReader - one schema for the message writer, other for the reader(https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificDatumReader.java#L49). In other words, for now to make changes in the schema, it is necessary for the consumer streamlet to read all the messages in the topic, and only then we can update deployed pipeline version. Are there any plans to support backward compatibility for avro messages in the future releases? Maybe the task overlaps with the plans for introducing the schema registry and this will be implemented within that feature? (#1010, #996)

For now I get an error while deserializing with new schema(compatible) message written in old schema:

Caused by: com.twitter.bijection.InversionFailure: Failed to invert: [B@46271dd6
	at com.twitter.bijection.InversionFailure$$anonfun$partialFailure$1.applyOrElse(InversionFailure.scala:43)
	at com.twitter.bijection.InversionFailure$$anonfun$partialFailure$1.applyOrElse(InversionFailure.scala:42)
	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:38)
	at scala.util.Failure.recoverWith(Try.scala:236)
	at com.twitter.bijection.Inversion$.attempt(Inversion.scala:32)
	at com.twitter.bijection.avro.BinaryAvroCodec.invert(AvroCodecs.scala:330)
	at com.twitter.bijection.avro.BinaryAvroCodec.invert(AvroCodecs.scala:320)
	at cloudflow.streamlets.avro.AvroSerde.$anonfun$inverted$1(AvroCodec.scala:39)
	at cloudflow.streamlets.avro.AvroSerde.$anonfun$decode$1(AvroCodec.scala:45)
	at scala.util.Try$.apply(Try.scala:213)
	at cloudflow.streamlets.avro.AvroSerde.decode(AvroCodec.scala:45)
	at cloudflow.streamlets.avro.AvroCodec.decode(AvroCodec.scala:34)

unit7-0 avatar May 19 '21 15:05 unit7-0