Switch goavro to fully support logical types

Open · pixie79 opened this issue 1 year ago · 1 comment

Logical types are increasingly common in Avro, especially when validating and sharing data between partners. Avro conversions should support UUIDs, dates and timestamps by default. The LinkedIn goavro library, which connect relies on, has not supported this for a long while and does not seem to be making progress on the issue. Maybe it is time to consider other options?

My use case is simple around the following types of fields:

                "fields": [
                    {
                        "name": "title",
                        "type": [
                            "null",
                            "string"
                        ],
                        "doc": "ENUM - Title for the party. DOCT - Doctor, MIST - Mr, MISS - Miss, MADM - Madame",
                        "default": null
                    },
                    {
                        "name": "dateOfBirth",
                        "type": [
                            "null",
                            {
                                "type": "int",
                                "logicalType": "date"
                            }
                        ],
                        "doc": "Date party was born",
                        "default": null
                    },
                    {
                        "name": "timestamp",
                        "type": {
                            "type": "long",
                            "logicalType": "timestamp-millis"
                        },
                        "doc": "The service timestamp value associated with the commit point in the platform database giving rise to the INSERT operation recorded in 'milliseconds since epoch' format"
                    }
                ]

Along with support for ENUMs.

root.payload.dateOfBirth = root.payload.dateOfBirth.ts_strptime("%Y-%m-%d").ts_unix()

Example error:

level=info msg="cannot decode textual record \"com.example.demo.event.v1.demoCustomerEvent\": cannot decode textual record \"com.example.demo.event.v1.payload\": cannot decode textual union: expected: '{'; actual: '\"' for key: \"title\" for key: \"payload\"" @service=benthos label="" path=root.pipeline.processors.4.catch.0

pixie79 · Jul 22 '24 17:07

Thanks for raising this @pixie79! The alternative library that I'm aware of is github.com/hamba/avro/v2 and Confluent offers a wrapper for it which may be worth looking at: https://github.com/confluentinc/confluent-kafka-go/tree/master/schemaregistry/serde/avrov2

mihaitodor · Jul 23 '24 22:07