schema_salad icon indicating copy to clipboard operation
schema_salad copied to clipboard

Backported schemas to `codegen` branch may break portability between platforms due to handling `typeDSL`

Open tom-tan opened this issue 1 year ago • 5 comments

I'm not sure here is an appropriate place for this issue but I report it here because it is related to saladVersion.

Historically, in my understanding, the saladVersion field was introduced not to break the compatibility when implementing nested typeDSL. That is:

  • Schemas of CWL v1.0, v1.1, v1.2 are defined with saladVersion v1.2 or earlier, not to handle nested typeDSL. Otherwise it may break portability between platforms
  • A Schema of CWL v1.3 is defined with saladVersion v1.3 to handle nestd typeDSL

At that time, we also discussed to introduce another field to control it but finally it was rejected.

By the way, saladVersion v1.3 also introduces map and union schemas to represent input objects and they are backported to the schemas in the codegen branch of CWL v1.0-v1.2.

That is, the code-gerenated parsers with backported schemas in saladVersion v1.3 accidentally support nested typeDSL. Not to break the portability between platforms, it would be nice if we can fix this issue.

There are several ways to fix it.

  1. Introduce a new field to control nested typeDSL
    • It was discussed when implementing nested typeDSL but finally we decided to introduce saladVersion instead
    • IMO this option is worth discussed again
  2. Give up to backport map and union schemas to CWL v1.0-v1.2
    • It may become complex to implement and maintain the platforms for CWL v1.0-v1.2
  3. Support nested typeDSL in CWL v1.0-v1.2
    • IMO we cannot use this option because it is a breaking change to the spec
  4. Others?

What do you think?

tom-tan avatar Aug 30 '24 03:08 tom-tan

This is a blocker of #861. I prefer the option 1, that is, to introduce a new field (e.g., mustExpand?) to control nested typeDSL syntax.

If it is OK to take this approach, I will create new issues for each code generators.

tom-tan avatar Oct 17 '24 09:10 tom-tan

How about extending the typeDSL field to accept an object with the mustExpandRecursively field, for example? Here is possible values of the typeDSL field:

value expand recursive note
null no - same as typeDSL: false
false no -
true yes platform dependent
{ mustExpandRecursively: false } yes platform dependent same as typeDSL: true
{ mustExpandRecursively: true } yes yes

It is consistent with the existing spec of typeDSL (the case of typeDSL is null, false, or true) but covers recursive expansions.

Notes:

  • In the current implementation of cwltool, nested typeDSLs are not expanded recursively.
  • mustExpandRecursively is a tentative name. I need more appropriate and (hopefully) shorter name for it.

tom-tan avatar Oct 30 '24 06:10 tom-tan

I agree that this is important and needs fixing, I also think that option 1 ("Introduce a new field to control nested typeDSL") is the correct method.

mr-c avatar Nov 11 '24 08:11 mr-c

Since we have total control of the codegen branches, as they are unofficial, I suggest removing the platform-dependent behavior and adding an expandRecursively field (or similar name) that can be missing, true, or false.

mr-c avatar Nov 11 '24 08:11 mr-c

Here is an intended behaviors after introducing expandRecursively:

value expand recursive note
null no - same as typeDSL: false
false no -
true yes no
{ expandRecursively: false } yes no same as typeDSL: true
{ expandRecursively: true } yes yes

{ expandRecursively: true } will be used since CWL v1.3.

tom-tan avatar Nov 11 '24 09:11 tom-tan