
Circus Train should also sync the external schema for Avro Tables when replication mode is METADATA_UPDATE

Open • abhimanyugupta07 opened this issue 5 years ago • 1 comment

As a user of CT, I want Circus Train to also sync the external schema for an Avro table when the replication mode is either METADATA_UPDATE or METADATA_MIRROR.

Context: At the moment, when the replication mode is METADATA_UPDATE or METADATA_MIRROR, the table location does not have to be provided in the config file. As a result, the null check at https://github.com/HotelsDotCom/circus-train/blob/master/circus-train-avro/src/main/java/com/hotels/bdp/circustrain/avro/transformation/AbstractAvroSerDeTransformation.java#L50 returns empty and the Avro transformation is not triggered, so the schema file is not copied to the replica.

This was discovered while working on: #131

Related PR: https://github.com/HotelsDotCom/circus-train/pull/141

Possible Solution: CT can detect that the table location has not been provided and, in the Avro transformation, get the table location from the target HMS instead.
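A minimal sketch of the fallback described above, to make the idea concrete. It is not Circus Train code: the class `AvroSchemaLocationResolver`, the method `resolve`, and the `Supplier` standing in for a lookup against the target HMS are all hypothetical names introduced for illustration, and the real transformation may be structured quite differently.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical helper, NOT part of Circus Train. It only illustrates the
// "use the configured location, else ask the target HMS" fallback.
public final class AvroSchemaLocationResolver {

  private AvroSchemaLocationResolver() {}

  /**
   * @param configuredLocation table location from the config file; may be null or
   *        blank in METADATA_UPDATE mode
   * @param targetTableLocation assumed lookup of the existing table's location in the
   *        target HMS; only invoked when no location is configured; may return null
   * @return the location to copy the Avro schema under, or empty if none is known
   */
  public static Optional<String> resolve(String configuredLocation,
      Supplier<String> targetTableLocation) {
    if (isPresent(configuredLocation)) {
      return Optional.of(configuredLocation);
    }
    // No location configured: fall back to the location already registered in the target HMS.
    String fallback = targetTableLocation.get();
    if (isPresent(fallback)) {
      return Optional.of(fallback);
    }
    // Neither source knows a location, so the schema copy still has to be skipped.
    return Optional.empty();
  }

  private static boolean isPresent(String value) {
    return value != null && !value.trim().isEmpty();
  }
}
```

With a fallback of this shape, the transformation would only be skipped when neither the config file nor the target HMS can supply a location, for example when the replica table does not exist yet.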

abhimanyugupta07, Sep 09 '19 15:09

This is not needed for METADATA_MIRROR, as that is an exact metadata copy of the table, so all locations will be the same.

patduin, Sep 09 '19 19:09