gradle-avro-plugin
Schema changes in nested objects do not reflect in parent object
Prerequisites
- [x] Are you running the latest version of the plugin? (Check releases)
- [x] Are you running a supported version of Gradle? (Check the README)
- [x] Are you running a supported version of Apache Avro? (Check the README)
- [x] Are you running a supported version of Java? (Check the README)
- [x] Did you check to see if an issue has already been submitted?
- [x] Are you reporting to the correct repository?
- [x] Did you perform a cursory search?
I updated the schema of a class (class B) that is used by another Avro schema (class A). Class A's SCHEMA$ variable was not updated with the new nested fields, resulting in serialization errors.
Thank you for the issue report. A sample project with detailed reproduction steps would assist greatly with ensuring that we’re discussing the same behavior.
It seems possible that you are running into problems related to the behavior introduced by AVRO-150. Specifically, in SpecificCompiler, when asked to compile a schema to Java code, it checks whether the destination file is more recent (by file system last modified timestamp) than the source file, and if so, silently skips doing anything.
In most cases, that should be fine: you change a file (which makes the last modified newer than the previous build artifacts), you run the build, and it re-compiles. In some cases, it can be sort of weird, though. Some examples:
- Your source files are in a version control system that messes with last modified timestamps, and you check out a different branch, resulting in source files that have changed but have older time stamps
- Your system's concept of what time it is goes backwards (daylight saving time, time zone change, NTP, unreliable system clock)
- You have multiple schema files that result in the generation of a given Java source file; the generation based on one of them then results in the generated file being more recent than the other source files
For many such situations, deleting the generated Java source files (such as via the clean task) and re-running the build should be sufficient to get to a consistent state.
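To make the timestamp behavior described above concrete, here is a minimal Java sketch of that kind of check. It is a simplified illustration, not the actual Avro source: the class name `TimestampSkipSketch`, the method `shouldSkip`, and the file names and timestamps in `main` are all hypothetical.

```java
import java.io.File;

// Hypothetical sketch of a "skip if destination is newer" check.
// This is NOT the real SpecificCompiler code, only an illustration of the idea.
class TimestampSkipSketch {

    // Returns true when generation would be silently skipped: the destination
    // already exists and is at least as recent as the source file.
    static boolean shouldSkip(boolean destExists, long sourceLastModified, long destLastModified) {
        return destExists && destLastModified >= sourceLastModified;
    }

    // Convenience overload that reads the timestamps from the file system.
    static boolean shouldSkip(File source, File dest) {
        return shouldSkip(dest.exists(), source.lastModified(), dest.lastModified());
    }

    public static void main(String[] args) {
        // A.avsc was last edited at t=1000 and A.java was generated at t=2000.
        // B.avsc (a type nested in A) is edited later, at t=3000.
        // Compiling A.avsc is skipped, so A.java keeps the stale nested schema:
        System.out.println(shouldSkip(true, 1000L, 2000L)); // prints "true"
        // Compiling B.avsc is not skipped, so B.java is regenerated:
        System.out.println(shouldSkip(true, 3000L, 2000L)); // prints "false"
    }
}
```

The first call in `main` corresponds to the case where a source file is unchanged (or has an older timestamp) even though the generated output is out of date, which is why deleting the generated files forces a consistent result.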
My plugin attempts to prevent situations where unreliable results are produced by parsing all source schemas that it has been provided, checking for duplicate definitions of any types (nested or otherwise) and intentionally failing if any type definitions conflict (that is, there are two or more definitions for what the contents of a given Java source file should be). That said, there are ways to get into this sort of situation that the plugin couldn't detect. A couple of possible solutions to this class of problem would be to ensure that all class generation for a given destination directory happens in a single GenerateAvroJavaTask run, or to extract all nested types to their own source files.
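As a rough illustration of the conflict check described above, the following Java sketch tracks each type's full name and fails when a second, different definition shows up. It is not the plugin's implementation: the class `DuplicateTypeCheckSketch`, the `register` method, and the use of plain strings as definitions are assumptions made for the example, as is the choice to accept identical repeated definitions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: detect conflicting definitions of the same type name.
// Definitions are plain strings here; a real implementation would compare parsed schemas.
class DuplicateTypeCheckSketch {

    // Full type name -> first definition seen for that name.
    private final Map<String, String> seen = new HashMap<>();

    // Records a definition. An identical repeat is accepted (an assumption of this sketch);
    // a different definition for an already-known name fails loudly.
    void register(String fullName, String definition) {
        String previous = seen.putIfAbsent(fullName, definition);
        if (previous != null && !previous.equals(definition)) {
            throw new IllegalStateException("Conflicting definitions for type " + fullName);
        }
    }
}
```

Failing fast like this avoids the situation where the contents of a generated Java file depend on which schema file happened to be compiled last.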
CC: @tomasAlabes
Thanks for the research @davidmc24! In my case, cleaning the build folder with the classes wasn't enough; I also had to clear the build cache. After that it worked again.
@tomasAlabes Thanks for the explicit call-out of the build cache. I don't have much experience with that yet, and wouldn't have thought to clear it.
@TalMoshel Actually, upon re-reading your submission, I think I understand now that you're referring to a related but different problem in this ticket than ones I've attempted to help with before. If you have nested types extracted to separate files, but don't change the parent file, and re-compile, Avro will refuse to re-write the parent file.
This is a type of failure that will be resolved by a clean build, and that's the behavior that the upstream project seems to want to encourage by how they handle this.
In terms of ways that this plugin might be able to do better...
- Previous versions of this plugin pro-actively cleaned their destination folders to work around this behavior. This caused problems for some people who configured their destination to be a source folder; we won't be going in that direction again.
- It might be possible to update the last modified timestamp of the source file immediately before calling SpecificCompiler to compile it. I'm not sure what additional impact that would have.
- We could compile the files to a temporary directory, and then pull the results into the real destination, at the cost of additional I/O.
Beyond that, I think the main way of changing it would be upstream (to remove this behavior, or at least allow disabling it, in Avro's SpecificCompiler class).
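The second and third ideas in the list above could look roughly like the following Java sketch. Nothing here is existing plugin code: the class `RegenerationWorkaroundsSketch` and the methods `touch` and `promote` are hypothetical names, and the sketch assumes the compiler has already written its output into a temporary directory.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

// Hypothetical sketch of two possible workarounds; not part of the plugin.
class RegenerationWorkaroundsSketch {

    // Idea 1: bump the source file's last-modified time right before compiling,
    // so that a "destination is newer" check no longer applies.
    static boolean touch(File source) {
        return source.setLastModified(System.currentTimeMillis());
    }

    // Idea 2: after generating into tempDir, copy every generated file into the
    // real destination, overwriting whatever is there (at the cost of extra I/O).
    static void promote(Path tempDir, Path destDir) {
        try (Stream<Path> paths = Files.walk(tempDir)) {
            paths.filter(Files::isRegularFile).forEach(generated -> {
                Path target = destDir.resolve(tempDir.relativize(generated).toString());
                try {
                    Files.createDirectories(target.getParent());
                    Files.copy(generated, target, StandardCopyOption.REPLACE_EXISTING);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Touching the source changes files the user owns, which may have side effects for version control or other tools; copying from a temporary directory leaves sources alone but always rewrites the outputs.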
I have the same problem, and for me even cleaning the cache wasn't enough. My schema looks like this:
(for reference, my version is 1.2.1 and my Gradle wrapper is 6.8)
error:
{
  "type": "record",
  "name": "SomeEvent",
  "namespace": "someNameSpace",
  "fields": [
    {
      "name": "data",
      "type": {
        "type": "record",
        "name": "SomeEventData",
        "fields": [
          {
            "name": "Boleto",
            "type": {
              "type": "record",
              "name": "Boleto",
              "fields": [
                {
                  "name": "data_vencimento",
                  "type": [
                    "string"
                  ],
                  "default": null
                }
              ]
            }
          }
        ]
      }
    }
  ]
}