XSD Schema versioning

Open untereiner opened this issue 1 year ago • 8 comments

What is the requested feature? Add versioning to the XSD schema. There are a few ways to do it (summary), more or less standard, so I would like to open the discussion.

I am setting the results of the simulation aside as out of scope. I only want to be sure that the objects and their parameters are valid to start a simulation.

Is your request related to a specific problem? I want to identify the revision of the GEOS code that corresponds to the xsd schema I used to validate a deck. My point is: given a deck and the xsd schema validating it, I should be able to retrieve the code that runs this simulation, compile it, and run it again. I am also wondering:

  • Should the deck alone be sufficient?
  • Should the xsd be allowed to evolve: if v0 of the schema validates my deck and v0 is a valid subset of v10 of the schema, should I be able to start the simulation on the basis of schema validation alone?

Describe the solution you'd like: I am open to discussions. Each solution has its pros and cons.

Describe alternatives you've considered: Save the commit hash somewhere (?)

Additional context: n.a.

untereiner avatar Feb 01 '24 14:02 untereiner

I don't think passing schema validation guarantees that the case will run and finish. There are tons of internal GEOS error checks that are not reflected in the schema at all. There is this interesting initiative, https://github.com/GEOS-DEV/GEOS/issues/2940, that may ultimately help connect the two.

paveltomin avatar Feb 01 '24 15:02 paveltomin

The point is not exactly to be able to finish a run. It is rather to be able to match a set of XML nodes with the classes in GEOS that can instantiate them. But if your schema refers to a specific git hash in the history, you will presumably get the version (and its dependencies) that was able to run and finish your simulation.

untereiner avatar Feb 01 '24 17:02 untereiner
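
To make the "schema refers to a specific git hash" idea concrete, here is a minimal sketch. It assumes the hash is stored in the standard `version` attribute of the `xs:schema` root element; nothing in this thread says the GEOS schema does this, and the hash value and the helper name `schema_revision` are placeholders.

```python
import xml.etree.ElementTree as ET
from typing import Optional

XS_NS = "http://www.w3.org/2001/XMLSchema"


def schema_revision(xsd_text: str) -> Optional[str]:
    """Return the value of the root `version` attribute, or None if absent."""
    root = ET.fromstring(xsd_text)
    if root.tag != f"{{{XS_NS}}}schema":
        raise ValueError("not an XSD document")
    return root.get("version")


# Hypothetical schema fragment; "0123abc" stands in for a real commit hash.
example = (
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" version="0123abc">'
    '<xs:element name="Problem"/>'
    "</xs:schema>"
)

print(schema_revision(example))
```

With the hash recorded this way, a validated deck plus its schema would be enough to check out the matching code revision.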

@untereiner - we could probably write a script that does this given the current setup. Essentially, we would need to loop over the GEOS hashes from the current one backwards (I'm not entirely sure how to do this), pull the associated version of the schema, and, if it validates the deck, return the hash.

I think that the larger issue here is that the schema is still changing fairly often. We should try to be more judicious about this, and perhaps have a higher bar to pass for schema changes.

cssherman avatar Feb 01 '24 19:02 cssherman
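
A rough sketch of the script described above, not an existing GEOS tool. It assumes the schema lives at `src/coreComponents/schema/schema.xsd` (adjust if it does not) and that the third-party `lxml` package is available for validation. All function names are made up for illustration.

```python
import subprocess
from typing import Callable, Iterable, List, Optional

# Assumed location of the schema inside the repository.
SCHEMA_PATH = "src/coreComponents/schema/schema.xsd"


def commits_touching(repo: str, path: str = SCHEMA_PATH) -> List[str]:
    """Hashes of the commits that changed `path`, newest first."""
    out = subprocess.run(
        ["git", "-C", repo, "log", "--format=%H", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.split()


def schema_at(repo: str, commit: str, path: str = SCHEMA_PATH) -> bytes:
    """Content of `path` as it was at `commit`."""
    return subprocess.run(
        ["git", "-C", repo, "show", f"{commit}:{path}"],
        capture_output=True, check=True,
    ).stdout


def validates_with_lxml(xsd: bytes, deck_path: str) -> bool:
    """Validate the deck against the schema (requires lxml)."""
    from lxml import etree  # imported lazily: third-party dependency
    schema = etree.XMLSchema(etree.fromstring(xsd))
    return schema.validate(etree.parse(deck_path))


def newest_validating_commit(
    commits: Iterable[str],
    load_schema: Callable[[str], object],
    validates: Callable[[object], bool],
) -> Optional[str]:
    """Walk commits from newest to oldest; return the first whose schema validates."""
    for commit in commits:
        if validates(load_schema(commit)):
            return commit
    return None


# Example wiring (paths are placeholders):
# repo = "/path/to/GEOS"
# found = newest_validating_commit(
#     commits_touching(repo),
#     lambda c: schema_at(repo, c),
#     lambda xsd: validates_with_lxml(xsd, "deck.xml"),
# )
```

Note that this only visits commits that changed the schema file, so the returned hash marks the start of a range of code revisions sharing that schema, not a single revision. `git log -- <path>` also does not follow renames of the schema file.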

@cssherman XML Schema 1.1 has a versioning namespace with conditional inclusion that provides minVersion and maxVersion attributes, which would help describe how the code changes.

In terms of C++, I do not know for now how to compute some kind of version for each wrapper and reflect it in the xsd without annoying programmers.

untereiner avatar Feb 02 '24 12:02 untereiner

For me, everything related to xsd/xml validation is like filling the bathtub of the Danaïdes, and will end up wasting energy for a result that can't be perfect. If we really want to do something, I'm convinced that we should push for a solution like the one proposed in https://github.com/GEOS-DEV/GEOS/discussions/1911 (or any other). With @castelletto1 we've been working on a demonstrator in branch feature/controlledInput, and we're going to present our work at the next yearly meeting.

TotoGaz avatar Feb 02 '24 17:02 TotoGaz

@TotoGaz yaml isn't typed. So instead of having a schema that changes regularly but that you can trust, you'll end up doing n rounds of trial and error on your entries until your deck matches. See how many commits you need to get your CI scripts running? At the least, I suggest something like dhall. You'll have high-level constructs without the rigidity of xsd.

untereiner avatar Feb 05 '24 14:02 untereiner

> @TotoGaz yaml isn't typed. So instead of having a schema that changes regularly but that you can trust, you'll end up doing n rounds of trial and error on your entries until your deck matches.

yaml alone, surely, but can't we add typed definitions alongside it? In the same way that today it's more or less xml + xsd?

> See how many commits you need to get your CI scripts running?

Actually, from my own experience, very few as far as the yaml schema is concerned, mostly because the GHA extension for vscode is very efficient (and not only for the yaml, there's much more than yaml in GHA). When we have to push another commit, it's often because it's complicated to know what state the runner is really in remotely, and almost never because of a yaml description...

> At the least, I suggest something like dhall. You'll have high-level constructs without the rigidity of xsd.

Why not, we need to see how it goes. I do not really mind changing the language, but let's see the implications. What I do care about is that we don't end up doing nothing.

TotoGaz avatar Feb 05 '24 17:02 TotoGaz

> @cssherman XML Schema 1.1 has a versioning namespace with conditional inclusion that provides minVersion and maxVersion attributes, which would help describe how the code changes.
>
> In terms of C++, I do not know for now how to compute some kind of version for each wrapper and reflect it in the xsd without annoying programmers.

That's pretty cool, but it sounds like it would be an absolute nightmare to maintain with our current approach!

cssherman avatar Feb 05 '24 17:02 cssherman