kafka-connect-cosmosdb

CosmosDb Sink Connector appends `_0` to the key of a protobuf `oneof` field

Open AlexKDawson opened this issue 2 years ago • 4 comments

Description

  • When using the sink connector, a Protobuf message with a oneof field is written with _0 appended to that field's key.

Expected Behavior

  • I would expect the key for the nested object to be the same in cosmos as in the protobuf.

Reproduce

  • Configure the Cosmos Sink Connector with the following:
  config_nonsensitive = {
    "connector.class" : "CosmosDbSink",
    "kafka.auth.mode" : "SERVICE_ACCOUNT",
    "kafka.service.account.id" : var.SA_CONNECTOR_ID,
    "tasks.max" : "1",

    "name" : "CosmosDbSink_namespace-project-tripintegrations-dev",
    "input.data.format" : "PROTOBUF",

    "topics" : "namespace-project-tripintegrations-byvehicle-dev-public",
    "connect.cosmos.databasename" : "Project.Demand",
    "connect.cosmos.containers.topicmap" : "namespace-project-tripintegrations-dev-public#TripStates",
    "cosmos.id.strategy" : "ProvidedInValueStrategy",

    "transforms" : "idTransform",
    "transforms.idTransform.type" : "org.apache.kafka.connect.transforms.ReplaceField$Value",
    "transforms.idTransform.renames" : "TripId:id",
  }
  • Feed data into the source topic using the following protobuf schema:

Project/Execution/TripByVehicle.proto

syntax = "proto3";

option csharp_namespace = "Namespace.Project.Protobuf.Execution";
import "google/protobuf/timestamp.proto";
import "Project/CommonMessages.proto";
package Project.Execution;

message TripByVehicle {
  string TripId = 1;
  .google.protobuf.Timestamp LastUpdateUtc = 2;
  ServiceLocation Origin = 3;
  ServiceLocation Destination = 4;
  .google.protobuf.Timestamp ScheduledDeparture = 5;
  .google.protobuf.Timestamp ScheduledArrival = 6;
}

Project/CommonMessages.proto // Supporting proto imported above

 syntax = "proto3";

  option csharp_namespace = "Namespace.Project.Protobuf";
  package Project;
 
  import "google/protobuf/timestamp.proto";

  message ServiceLocation {
    bool IsLocationCode = 1;
    oneof LocationDetail {
      string LocationCode = 2;
      .Project.Address Address = 3;
    }
  }
  
  message Address {
     string AddressLine1 = 1;
     string AddressLine2 = 2;
     string City = 3;
     string Country = 4;
     string FirstName = 5;
     string LastName = 6;
     string PostalCode = 7;
     string State = 8;
  }
  • Send data into the topic

  • Cosmos DB stores the LocationDetail field under the key LocationDetail_0 in the container:

...
    "Origin": {
        "LocationDetail_0": {                                 <------------- The issue occurs here
            "LocationCode": "ABCDE",
            "Address": null
        },
        "IsCarvanaLocation": true
    },
    "Destination": {
        "LocationDetail_0": {                                 <------------- And here
            "LocationCode": "FGHIJ",
            "Address": null
        },
        "IsCarvanaLocation": true
    },
 ...
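To make the difference between the observed and the expected key concrete, here is a minimal Python sketch that renames `LocationDetail_0`-style keys back to the oneof name in a document shaped like the one above. The helper `strip_oneof_suffix`, the suffix pattern (underscore followed by digits), and the idea of passing the oneof names explicitly are assumptions for illustration only. None of this is part of the connector, and it does not explain where the suffix comes from.

```python
import re

# Assumption: the unwanted suffix is an underscore followed by digits, e.g. "_0".
_SUFFIX = re.compile(r"_\d+$")


def strip_oneof_suffix(doc, oneof_names):
    """Return a copy of doc where keys such as 'LocationDetail_0' become 'LocationDetail'.

    Only keys whose base name is listed in oneof_names are renamed, so
    unrelated fields that happen to end in digits are left untouched.
    """
    if isinstance(doc, dict):
        out = {}
        for key, value in doc.items():
            base = _SUFFIX.sub("", key)
            new_key = base if base in oneof_names else key
            out[new_key] = strip_oneof_suffix(value, oneof_names)
        return out
    if isinstance(doc, list):
        return [strip_oneof_suffix(item, oneof_names) for item in doc]
    return doc


# Shape of the document as observed in the container (trimmed to the oneof field).
observed = {
    "Origin": {
        "LocationDetail_0": {"LocationCode": "ABCDE", "Address": None},
    },
}

# Shape the issue expects: the key matches the oneof name in the .proto file.
expected = strip_oneof_suffix(observed, {"LocationDetail"})
```

A helper like this could only run on the consumer side after reading from Cosmos DB, so it is a way to describe the mismatch rather than a fix for the sink itself.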

Additional Context

  • If this is intended behavior due to the use of a oneof, could you please point me toward the documentation? I was not able to find any information about it.

AlexKDawson, Sep 22 '22 00:09

@xinlian12 please take a look.

kushagraThapar, Sep 22 '22 21:09

Has there been any movement on this? Has this been reproducible?

AlexKDawson, Oct 25 '22 18:10

@AlexKDawson this issue is in our backlog; we plan to pick it up in the next few weeks.

xinlian12, Oct 25 '22 19:10

Thanks for the update!

AlexKDawson, Oct 25 '22 21:10