MaterialX icon indicating copy to clipboard operation
MaterialX copied to clipboard

Discussion about breaking changes and versioning

Open ix-dcourtois opened this issue 2 years ago • 9 comments

Hi,

I'd like to open a discussion about a little friction when updating MaterialX library: breaking changes :) This is a bit touchy because I know that from a library's maintainers' point of view, it's very tempting to evolve the API to keep it as clean as possible, at the risk of breaking things for user code. But on the other hand, every time I update MaterialX in our build system, I have to spend a fair bit of time to fix things. Sometimes it was quite simple, some other times a lot trickier. And most of the time at places where I didn't expect it :)

Anyway, MaterialX being quite new, I understand that the API is not stable enough yet, and trying to make it evolve while trying to keep it backward compatible could quickly become a maintenance nightmare for you. But I feel there must be a sort of middle ground that would allow us to update without too many headaches, and you to make changes without dritying things too much, thus this discussion.

For breaking changes in the C++ API, a solution could be to flag things as deprecated while keeping them, and add the new things in parallel. And once in a while, cleanup the deprecated APIs. Ideally, the cleanup (e.g. breaking change) would happen on minor version update (e.g. from 1.38.5 to 1.39.0 for instance) This way, patch version update could still introduce new functionality, but it would guarantee that it wouldn't break user code: we would be able to update a lot easier, get new functionality and fixes without problem, and we would have time to convert the deprecated APIs that we're using before updating to a new minor version.

The second kind of breaking changes are ones related to data: for instance between the 1.38.3 and the 1.38.5, the directory structure of libraries/genosl changed, and the osl headers mx_funcs.h moved around. This broke compilation of generated OSL code because the include paths were no longer set correctly, so we had to patch our compilation options to account for that. Those kind of changes are a bit more difficult to handle I think because there is no clean deprecation mechanism for those... Maybe at least defer those changes to a minor version update? That way patch version update would guarantee that we could update without problem?

Sorry for the length :) Any thoughts, ideas?

Regards, Damien.

ix-dcourtois avatar Jul 06 '22 15:07 ix-dcourtois

Thanks for starting this discussion, @ix-dcourtois, and I think it's very worthwhile. To align terminology for this thread, I'll point out the documentation on the MaterialX versioning system, so that we have a common language for describing the MaterialX versioning capabilities as they stand today:

https://github.com/AcademySoftwareFoundation/MaterialX/blob/main/documents/DeveloperGuide/MainPage.md#materialx-versioning

jstone-lucasfilm avatar Jul 06 '22 15:07 jstone-lucasfilm

Ah yes my mistake, I thought the 3 digits were the usual major.minor.patch. Well, the same reasoning can be had with this versioning scheme, but I'm guessing that the spec version will not be bumped very often, making cleaning things a lot less frequent, and thus more difficult from the library's maintainers point of view. Maybe using a 4 digit version number (like OpenShadingLanguage) could allow a bit more flexibility (something like major.minor.build.patch) ? This way the build number could be split in 2, and once in a while (de-correlated from the spec updates) the build version can be updated to clean deprecated stuff. Not sure this is a perfect solution, but just an idea.

ix-dcourtois avatar Jul 06 '22 15:07 ix-dcourtois

A patch identifier was considered before so maybe something like this could work (?):

  1. If there is any non-backwards compatible change or a data representation change then the patch level is bumped. It may or may not be possible to be backwards compatible all the time. Deprecated API can be marked as such.

So next patch would be 1.39.5.1.

  1. Patch level is not part of the version. Issues I've run into the past is that there was no indication in the data file that within 2 build releases or within 2 minor releases anything had changed. A suggestion from before was to add a patch attribute to each document. I'm not sure but build.patch I think would be make it non-ambiguous. There would be an include directive and library versioning support as well.

  2. At build version and minor release time it is allowed to perform deprecation and breakages. There hasn't been a major version change yet, but I assume this could be for non-backwards compatible data model changes which cannot be upgraded (at all or not fully automated)

I still am leaning towards having build / patch number be considered of the document upgrade logic for data files as it would remove headaches (and disgruntled integration teams :)) of not knowing if the data format has changed or not.


As for tagging in the repo, I've heard that folks want to create specific branches for releases, but I think I'd still prefer to go with the single branch + tags. A question is do we tag patches ? -- I think it would be useful.

kwokcb avatar Jul 07 '22 02:07 kwokcb

@bernardkwok That's an interesting idea, though I was imagining the reverse of your proposal, where a patch update denotes a set of fixes and improvements that have no impact on the MaterialX API or the layout of its data libraries. This would align with the idea that it's a "smaller" update than we denote with the existing build version number. This might be used, for example, to put out a set of minor fixes to an existing MaterialX release, and the use of the patch version would convey the idea that developers can upgrade to this version without any changes to their own code.

jstone-lucasfilm avatar Jul 07 '22 21:07 jstone-lucasfilm

@jstone-lucasfilm My thoughts exactly!

ix-dcourtois avatar Jul 08 '22 06:07 ix-dcourtois

@jstone-lucasfilm, @ix-dcourtois I didn't read carefully enough that your looking for the lowest level for pushing out "non-breaking" changes. Perhaps I'm thinking more of file-format bumps where everything breaking must have a version bump, so I would still put forth the suggestion that we formally add "patch" as part the document as required.

I'm wondering if it's good to explicitly lay out what is considered "breaking", how long old behaviour needs to stick around, and what if there is no way to keep both.

Examples:

  1. API compatibility: or something like an signature change, we keep both signature working, mark the old signature as deprecated for the patch. During the next Nth minor release we are allowed to remove it where N = some number of releases.or the next major release? Signature could be C++, Python, JS etc.

  2. Data Model compatibility: For data model upgrades (including signature changes for nodedefs), I assume as long as there is an upgrade path and does not remove anything then these are not "breaking" and can be part of a patch. e.g. Adding new nodedefs is not breaking.

  3. Logic compatibility: For behaviour / logic changes like the library search change, if this cannot have a fallback / alternative then it must wait until the next minor release, otherwise it can be part of a patch. e.g. Is changing shading model node behaviour breaking, or is changing the OSL matrix / vec4 logic breaking (both of which occurred for 1.38.5) ?

  4. Build logic compatibility: If dependencies or build configuration changes then this is breaking.

Perhaps I'm overthinking this, but in practice breaking and non-breaking changes can inter-mix so pushing breaking changes to a minor release could hold up that and dependent changes for months as there is no branching for versions.

Please let me know if I'm too noisy on this the details of this :)

kwokcb avatar Jul 11 '22 13:07 kwokcb

It's like trying to get one number or dotted stream of numbers to do the job of SemVar for API, and simultaneously SemVar for what can be reified in a material graph. The thing about SemVar and any dotted number scheme is there's an implication of strict ordering, such that a.b.c >= d.e.f iff a >= d & b >= e & c >= f.

It's not clear to me reading this thread that there is a strict ordering possible when api and data model compatibility are not also strictly ordered. In other words, data model could increment many times without incrementing api, and vice versa. I think the only thing that makes sense is something like 1.38.6_d1.38.6 where d tags the data model (so one doesn't forget which is which every time you look at a version or are responsible for getting a release published ;) ) and then over time the two numbers increment independently. The suggestion of 1.38.6_d1.38.6 comes from the idea of establishing an epoch, the zero-day when both API and data number were both thought to correspond one to one.

If any one has a better analysis of the mathiness of it all that doesn't require six numbers, I absolutely stand to be schooled :)

Postscript - the argument against is that all of the linux tooling around being smart about versions doesn't work with this scheme, but I'd argue that since matx isn't following semvar anyway, that was never an argument in the first place.

Post Postscript - it also begs an argument about SO numbering, presumably SO numbering increments by one, every time either API or Data model advances. I don't know if matx already has an SOVERSION policy in place.

meshula avatar Jul 11 '22 17:07 meshula

@meshula I think it's reasonable to assume a strict ordering, where the specification version would never increase without clearing the build and patch versions to zero. So when we ultimately release a 1.39 specification for MaterialX (e.g. containing proposed improvements to the connection syntax between nodes), I would expect the next released codebase to be 1.39.0 (or 1.39.0.0 if we assume the addition of a patch version).

While we have historically provided deprecated wrapper methods to inform clients about changes required in their code (and we're interested in improving this deprecated functionality system as suggested by @kwokcb above), I wouldn't expect a client application upgrade to this 1.39.0 codebase without corresponding upgrades to their own code as well.

jstone-lucasfilm avatar Jul 11 '22 18:07 jstone-lucasfilm

where the specification version would never increase without clearing the build and patch versions to zero.

That makes sense to me. From the conversation so far I was starting to struggle to identify if that, or some similar mechanism, was the case or not.

meshula avatar Jul 11 '22 18:07 meshula

Thanks for starting this discussion, @ix-dcourtois, and we've clarified the rules for changes allowed in version upgrades in https://github.com/AcademySoftwareFoundation/MaterialX/pull/1664.

jstone-lucasfilm avatar Jan 20 '24 22:01 jstone-lucasfilm