Need a method to generate a schema from a struct:
type SimpleRecord struct {
	A int64  `avro:"a"`
	B string `avro:"b"`
}
schema, err := avro.ParseStruct("org.hamba.avro", "record", &SimpleRecord{})
because the field information is duplicated between the struct and the schema.
What is the question/proposal here?
I think this is a duplicate of #263.
@longbai, here is my advice, based on our own experience. We tried to think this way in the past and realized it is not the right way to approach the problem.
With Avro, you should always start with the schema first. Then, for convenience, you can generate Go structs from that schema so you can create typed objects to serialize/deserialize for that schema in particular. But this is technically optional as you can also serialize/deserialize to/from generic maps as well without Go structs.
If using structs, your program should always have a copy of the schema in the binary to use for serialization/deserialization (that is, you need a parsed schema object). If using generic maps, you can keep the schemas external to the binary, e.g. in a schema registry or on disk, and read them at runtime.
In our experience, an easy way to manage this in Go is to use the Go embed tooling. This way you can keep the schema on disk and make sure the binary always contains the correct schema it was designed to work with.
The biggest reason to embed the schema into the binary is that it does not make much sense to change the schema and not the binary. If you change the schema, it is highly likely that you have to change something in your program as well. In other words, Avro schemas are highly coupled to the program using them.
@nrwiersma please correct me if the above is still not the right way to think about it. This is how we are using this great library, with a lot of success, in our software components.
@hhromic In general that is correct.
In my current project, I chose to use Avro mainly because of its flexibility regarding schema generation.
I tend to disagree, and consider Avro a friendly format for a code-first approach compared to other formats such as Protocol Buffers (because of the absence of tag numbers).
The Encoding and Evolution chapter of the book Designing Data-Intensive Applications (by Martin Kleppmann) highlights this point.
It's not that you cannot generate the schema from objects, but that the schema is the source of truth for encoding and decoding data in Avro. There are of course use cases where the objects make up a central definition of data exchange (a good example of this is Kubernetes), but this is not the general use case. And even in these cases, the object is only used when working in the same language as the API server. For example, were Kubernetes to support Avro in its API, the object would only be useful in Go; in any other language I must depend instead on the fixed schema generated from those objects.
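To illustrate the point that a schema can be derived from objects, here is a minimal sketch that uses only the Go standard library. It is not part of the library: schemaFromStruct and avroPrimitive are hypothetical helpers, the record name and the (namespace, name) argument order are assumptions made for illustration, and only a few primitive field types are handled.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// SimpleRecord mirrors the struct from the original request.
type SimpleRecord struct {
	A int64  `avro:"a"`
	B string `avro:"b"`
}

// field and record model just enough of an Avro record schema for this sketch.
type field struct {
	Name string `json:"name"`
	Type string `json:"type"`
}

type record struct {
	Type      string  `json:"type"`
	Name      string  `json:"name"`
	Namespace string  `json:"namespace"`
	Fields    []field `json:"fields"`
}

// avroPrimitive maps a small set of Go kinds to Avro primitive type names.
func avroPrimitive(k reflect.Kind) (string, bool) {
	switch k {
	case reflect.Bool:
		return "boolean", true
	case reflect.Int32:
		return "int", true
	case reflect.Int64:
		return "long", true
	case reflect.Float32:
		return "float", true
	case reflect.Float64:
		return "double", true
	case reflect.String:
		return "string", true
	}
	return "", false
}

// schemaFromStruct is a hypothetical helper (not a library API). It reads the
// `avro` struct tags of v and returns an Avro record schema as a JSON string.
func schemaFromStruct(namespace, name string, v any) (string, error) {
	t := reflect.TypeOf(v)
	if t != nil && t.Kind() == reflect.Pointer {
		t = t.Elem()
	}
	if t == nil || t.Kind() != reflect.Struct {
		return "", fmt.Errorf("expected a struct or pointer to struct, got %T", v)
	}
	rec := record{Type: "record", Name: name, Namespace: namespace, Fields: []field{}}
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		tag, ok := f.Tag.Lookup("avro")
		if !ok || tag == "-" {
			continue // fields without an avro tag are skipped in this sketch
		}
		typ, ok := avroPrimitive(f.Type.Kind())
		if !ok {
			return "", fmt.Errorf("field %s: unsupported Go type %s", f.Name, f.Type)
		}
		rec.Fields = append(rec.Fields, field{Name: tag, Type: typ})
	}
	b, err := json.Marshal(rec)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s, err := schemaFromStruct("org.hamba.avro", "SimpleRecord", &SimpleRecord{})
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

A schema produced this way is only directly useful to the Go program that owns the struct. Consumers in other languages still depend on the generated schema text, which is why the schema remains the source of truth.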
👍 thanks @nrwiersma @hhromic
I have code that dynamically generates a Go struct from an avro.Schema. I use it to read and write Avro with only a schema as input: avroSchema2Struct.go. Let me know if it can help you.
This is not a feature that will be added to the library, as the schema is the source of truth in Avro and in this module.
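The avroSchema2Struct.go file mentioned above is not reproduced here. As a rough idea of how such a conversion could work, the following sketch builds a Go struct type at runtime with reflect.StructOf from a record schema given as JSON. It uses only the standard library rather than avro.Schema, and structFromSchema and goTypes are hypothetical names. It assumes a flat record whose fields are Avro primitives and whose field names start with an ASCII letter and do not collide once capitalized.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
	"strings"
)

// goTypes maps Avro primitive type names to Go types (primitives only).
var goTypes = map[string]reflect.Type{
	"boolean": reflect.TypeOf(false),
	"int":     reflect.TypeOf(int32(0)),
	"long":    reflect.TypeOf(int64(0)),
	"float":   reflect.TypeOf(float32(0)),
	"double":  reflect.TypeOf(float64(0)),
	"string":  reflect.TypeOf(""),
	"bytes":   reflect.TypeOf([]byte(nil)),
}

// structFromSchema is a hypothetical helper that turns a flat Avro record
// schema into a struct type with exported fields and `avro` tags.
// Complex field types (unions, arrays, nested records) are rejected.
func structFromSchema(schemaJSON string) (reflect.Type, error) {
	var rec struct {
		Type   string `json:"type"`
		Fields []struct {
			Name string `json:"name"`
			Type string `json:"type"`
		} `json:"fields"`
	}
	if err := json.Unmarshal([]byte(schemaJSON), &rec); err != nil {
		return nil, fmt.Errorf("unsupported or invalid schema: %w", err)
	}
	if rec.Type != "record" {
		return nil, fmt.Errorf("expected a record schema, got %q", rec.Type)
	}
	fields := make([]reflect.StructField, 0, len(rec.Fields))
	for _, f := range rec.Fields {
		goType, ok := goTypes[f.Type]
		if !ok {
			return nil, fmt.Errorf("field %q: unsupported type %q", f.Name, f.Type)
		}
		if f.Name == "" || !isASCIILetter(f.Name[0]) {
			return nil, fmt.Errorf("field %q: name must start with a letter", f.Name)
		}
		fields = append(fields, reflect.StructField{
			// reflect.StructOf requires exported field names.
			Name: strings.ToUpper(f.Name[:1]) + f.Name[1:],
			Type: goType,
			Tag:  reflect.StructTag(fmt.Sprintf(`avro:"%s"`, f.Name)),
		})
	}
	return reflect.StructOf(fields), nil
}

func isASCIILetter(c byte) bool {
	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
}

func main() {
	const schema = `{"type":"record","name":"SimpleRecord","fields":[` +
		`{"name":"a","type":"long"},{"name":"b","type":"string"}]}`
	typ, err := structFromSchema(schema)
	if err != nil {
		panic(err)
	}
	// Create a value of the generated type and fill it in through reflection.
	v := reflect.New(typ).Elem()
	v.Field(0).SetInt(27)
	v.Field(1).SetString("foo")
	fmt.Printf("%+v\n", v.Interface())
}
```

Because the type only exists at runtime, the rest of the program can reach its fields only through reflection, so this trades compile-time type safety for the ability to work from a schema alone.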