Convert from Avro logical types to .NET equivalents (and vice versa)
Description
I'm trying to find a way to use the Avro logical types. Behind the scenes, the avrogen tool generates a byte[] property for decimal fields, and I couldn't find a way to convert a .NET decimal to or from it.
Can you please help me with this?
How to reproduce
- Create a schema file, e.g. User.avsc:
  [
    {
      "namespace": "Confluent.Kafka.Examples.AvroSpecific",
      "type": "record",
      "name": "User",
      "fields": [
        { "name": "name", "type": "string" },
        { "name": "favorite_number", "type": [ "int", "null" ] },
        { "name": "favorite_color", "type": [ "string", "null" ] },
        { "name": "birth_date_raw", "type": { "type": "int", "logicalType": "date" } },
        { "name": "salary_raw", "type": { "type": "bytes", "logicalType": "decimal", "precision": 19, "scale": 4 } }
      ]
    }
  ]
- Run the avrogen tool in the console:
  avrogen.exe -s .\User.avsc .
- See the generated property in User.cs (the generated class):
  public byte[] salary_raw
  {
      get { return this._salary_raw; }
      set { this._salary_raw = value; }
  }
Checklist
Please provide the following information:
- [ ] Confluent.Kafka nuget version:
- [ ] Apache Kafka version:
- [ ] Client configuration:
- [ ] Operating system:
- [ ] Provide logs (with "debug" : "..." as necessary in configuration)
- [ ] Provide broker log excerpts
- [ ] Critical issue
Avro logical types are not supported by the .NET Avro library (AFAIK they are only supported in Java). I have not tried this, but you could perhaps augment the classes generated by avrogen with your own serialization / deserialization in some way. Marking this as an enhancement, as we should provide guidance on the best way to do this.
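One way to read that suggestion, as a minimal sketch rather than a supported API: keep the generated byte[] property and convert at the edges. The helper below (`AvroDecimal`, a hypothetical name that exists in neither Confluent.Kafka nor Apache.Avro) encodes a .NET decimal as the big-endian two's-complement unscaled integer that the Avro specification describes for the decimal logical type, and decodes it back. The scale is supplied by the caller and is assumed to match the schema (4 for salary_raw in the schema above).

```csharp
using System;
using System.Numerics;

// Hypothetical helper, not part of Confluent.Kafka or Apache.Avro.
public static class AvroDecimal
{
    // Returns 10^scale as a decimal; valid for 0 <= scale <= 28.
    private static decimal Pow10(int scale)
    {
        if (scale < 0 || scale > 28)
            throw new ArgumentOutOfRangeException(nameof(scale));
        decimal result = 1m;
        for (int i = 0; i < scale; i++) result *= 10m;
        return result;
    }

    // decimal -> Avro bytes. The value is rounded to the schema scale first
    // (banker's rounding, the default of decimal.Round).
    // Throws OverflowException if value * 10^scale does not fit in a decimal.
    public static byte[] ToBytes(decimal value, int scale)
    {
        decimal unscaled = decimal.Round(value, scale) * Pow10(scale);
        byte[] bytes = new BigInteger(unscaled).ToByteArray(); // little-endian two's complement
        Array.Reverse(bytes);                                  // Avro expects big-endian
        return bytes;
    }

    // Avro bytes -> decimal. Throws OverflowException if the unscaled
    // integer is too large for a .NET decimal.
    public static decimal FromBytes(byte[] bytes, int scale)
    {
        byte[] littleEndian = (byte[])bytes.Clone(); // do not mutate the caller's array
        Array.Reverse(littleEndian);
        return (decimal)new BigInteger(littleEndian) / Pow10(scale);
    }
}
```

Assuming the generated User class is declared partial, a wrapper property such as `public decimal Salary { get => AvroDecimal.FromBytes(salary_raw, 4); set => salary_raw = AvroDecimal.ToBytes(value, 4); }` could then live in a separate file that avrogen does not overwrite.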
I guess this request should be submitted to the Apache Avro repository instead of this one, given that the classes are generated by the avrogen tool.
I found this fork of the Apache Avro repository that has some initial logicalType support built in that might be able to be cleaned up and used as a starting point.
Ah, that was me trying to put support for logical types in place. Unfortunately, the Avro codebase was quite old and not very friendly (e.g., the Apache project was not even using GitHub when I tried, and the repo linked above was just a mirror). Since I was in quite a rush at the time, I simply created a helper method that translates byte[] to decimal according to the original Java specification:
// Requires the System.Linq and System.Numerics namespaces.
// http://www.ralbu.com/post/2013/07/19/BigInteger-in-Java-and-C
// C#: the individual bytes in the value array should be in little-endian order,
// from lowest-order byte to highest-order byte.
public static decimal GetDecimalFromByteArray(byte[] bytes)
{
    // Avro (like Java's BigInteger) stores the bytes big-endian, so reverse them first.
    var result = new BigInteger(bytes.Reverse().ToArray());
    // Hard-coded scale of 8.
    return (decimal)result * (decimal)Math.Pow(10, -8);
}
The method above simply assumes a fixed scale of 8, which was the case at the time (Avro messages encoded by Debezium coming from a MySQL/MariaDB database, with all decimals set to (20,8) in the db). The only huge pain for me was that I needed to know which fields were decimals before using them. Whether you use Avro GenericRecords or the avrogen tool, you lose the information about the logical type, which could otherwise be used to decide whether a field is a real decimal or not (there is another open issue about this: https://github.com/confluentinc/confluent-kafka-dotnet/issues/636). If you are using the avrogen tool, you can probably adapt the code above in some way, but in our case we went for GenericRecords and just used a helper method when we had to read decimal values (really few cases, just a couple of instances). This approach obviously doesn't scale very well and could quickly become a pain to maintain; it would be better to have "native" decimals.
tl;dr You probably shouldn't build on the (failed) attempt above.
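For the "which fields are decimals?" problem described above, one possible stopgap is to read that information from the schema text itself. This is only a sketch under assumptions: it presumes the consumer has the .avsc JSON available, it uses System.Text.Json instead of the Avro library's own schema objects, it expects a single record object (not an array of schemas), and it only recognizes decimals declared inline as an object. `DecimalFieldScanner` and `FindDecimalScales` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper: maps field name -> scale for every decimal field
// declared inline in a single Avro record schema given as JSON text.
public static class DecimalFieldScanner
{
    public static Dictionary<string, int> FindDecimalScales(string recordSchemaJson)
    {
        var scales = new Dictionary<string, int>();
        using JsonDocument doc = JsonDocument.Parse(recordSchemaJson);

        foreach (JsonElement field in doc.RootElement.GetProperty("fields").EnumerateArray())
        {
            JsonElement type = field.GetProperty("type");

            // Only the inline-object form is handled; unions and named types are skipped.
            if (type.ValueKind != JsonValueKind.Object) continue;
            if (!type.TryGetProperty("logicalType", out JsonElement logical)) continue;
            if (logical.ValueKind != JsonValueKind.String || logical.GetString() != "decimal") continue;

            // The Avro spec makes "scale" optional, defaulting to 0.
            int scale = type.TryGetProperty("scale", out JsonElement s) ? s.GetInt32() : 0;
            scales[field.GetProperty("name").GetString()] = scale;
        }
        return scales;
    }
}
```

With the resulting dictionary, a GenericRecord reader could look up the scale by field name before converting the byte[] value, instead of hard-coding which fields are decimals.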
In case this helps: here is a variant of @manuel-zulian's method for when s is the Base64-encoded Avro decimal and you know the original scale:
public static decimal GetDecimalFromBase64String(string s, int scale)
{
    var bytes = Convert.FromBase64String(s);
    var result = new BigInteger(bytes.Reverse().ToArray());
    return (decimal)result * (decimal)Math.Pow(10, -scale);
}
It's OK as a workaround, but this should really be done by the generated class, especially since there are other logical types (such as the date and time ones):
https://avro.apache.org/docs/1.8.0/spec.html#Logical+Types
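As an illustration of one of those other types: the spec defines the date logical type as an int counting days from the Unix epoch (1 January 1970), which is what a field like birth_date_raw in the schema above would carry. A minimal conversion sketch follows; `AvroDate` is a hypothetical helper name, not something the generated class provides.

```csharp
using System;

// Hypothetical helper for the Avro "date" logical type
// (an int holding the number of days since 1970-01-01).
public static class AvroDate
{
    private static readonly DateTime Epoch =
        new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Unspecified);

    // Avro int -> calendar date (time-of-day is always midnight).
    public static DateTime FromDays(int days) => Epoch.AddDays(days);

    // Calendar date -> Avro int. Any time-of-day component is discarded.
    public static int ToDays(DateTime date) => (date.Date - Epoch).Days;
}
```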
Any news regarding this issue? I am new to Avro myself and would like to generate a C# model from an Avro schema (.avsc) file using the avrogen tool. One of the properties defined in the schema has a 'timestamp-millis' logical type that I would like to have converted to a C# type.
https://avro.apache.org/docs/1.8.0/spec.html#Timestamp
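Until the generated code handles this, a manual conversion is fairly short. Per the spec linked above, timestamp-millis annotates a long holding milliseconds since the Unix epoch in UTC, so assuming the generated property is a plain long, something like the following sketch should work (`AvroTimestampMillis` is a hypothetical helper name):

```csharp
using System;

// Hypothetical helper for the Avro "timestamp-millis" logical type
// (a long holding milliseconds since 1970-01-01T00:00:00.000 UTC).
public static class AvroTimestampMillis
{
    // Avro long -> DateTime with Kind == Utc.
    public static DateTime FromMillis(long millis) =>
        DateTimeOffset.FromUnixTimeMilliseconds(millis).UtcDateTime;

    // DateTime -> Avro long. Note that ToUniversalTime() treats a DateTime
    // with Kind == Unspecified as local time, so pass UTC values to be safe.
    public static long ToMillis(DateTime value) =>
        new DateTimeOffset(value.ToUniversalTime()).ToUnixTimeMilliseconds();
}
```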