pywps
pywps copied to clipboard
Should PyWPS automatically set minOccurs=0 when literal data default is provided?
Description
Field min_occurs=1
is the default of inputs. When default
parameter is provided to define a default literal value, that default 1
occurrence loses its meaning. I would imagine most people expect that adding this default
parameter will make their process pick that value if it was omitted, but this will never happen due to min_occurs=1
check at execution time. The process description also displays ``minOccurs=1and
defaultValue` simultaneously, which makes the default behaviour hard to understand.
I am questioning whether PyWPS should be smart about it and automatically adjust the value to 0
when default
is provided, since it doesn't make sense to have that default if it cannot be resolved by default when omitting the input value.
Although the fix is relatively simple to manually adjust any process I/O incorrectly defined, I find this is a quite common and recurring issue. See for example additional information which is just a list of cases I found within a week, over many WPS process implementations.
Environment
- operating system: ubuntu 20.04
- Python version: 3.8
- PyWPS version: latest
- source/distribution
- [ ] git clone
- [ ] Debian
- [x] PyPI
- [ ] zip/tar.gz
- [ ] other (please specify):
- web server
- [ ] Apache/mod_wsgi
- [ ] CGI
- [ ] other (please specify):
Additional Information
Following related issues/PR where similar problem was noticed:
- https://github.com/bird-house/flyingpigeon/issues/342
- https://github.com/bird-house/finch/pull/199
- https://github.com/crim-ca/weaver/issues/17
- https://github.com/crim-ca/weaver/pull/326
Instead of being smart and make an automatic adjustment, maybe PyWPS could raise a Warning when instantiating the Process. We'd catch these issues during development and resolve them explicitly.
@huard I guess that would be a good compromise without directly impacting existing process definitions.
I have added a new process in Emu to show the usage of default values: https://github.com/bird-house/emu/pull/114
Currently the default value is used when no value is provided both for min_occurs=0
and min_occurs=1
.
Is this what is expected or should it be different?
@cehbrecht
Maybe my interpretation is wrong, but I believe that if an input has min_occurs=1
, and you don't provide it during Execute request, the process should return an error indicating it is missing a required input. Therefore, the default would never be used, because you must provide one.
I dug down in the specs.
OGC says that definitions of minOccurs
, maxOccurs
and default
should behave as defined by XML schema specifications when parsing WPS-1.
Looking at https://www.w3.org/TR/xmlschema-0/#OccurrenceConstraints It is mentioned (emphasis added by myself):
In general, an element is required to appear when the value of minOccurs is 1 or more.
and
When an attribute is declared with a default value, the value of the attribute is whatever value appears as the attribute's value in an instance document; if the attribute does not appear in the instance document, the schema processor provides the attribute with a value equal to that of the default attribute. Note that default values for attributes only make sense if the attributes themselves are optional, and so it is an error to specify both a default value and anything other than a value of optional for use.
There is a distinction between presence of the input and its contained value. Looking at the example table, following case is provided:
Elements (minOccurs, maxOccurs) fixed, default |
Attributes use, fixed, default |
Notes |
---|---|---|
(0, 2) -, 37 | n/a | element may appear once, twice, or not at all; if the element does not appear it is not provided; if it does appear and it is empty, its value is 37; otherwise its value is that given; in general, minOccurs and maxOccurs values may be positive integers, and maxOccurs value may also be "unbounded" |
Therefore, it looks like it is valid to have min_occurs=1
and default
at the same time, but one must still provide the input with an empty value to employ that default. Omitting the input should raise a schema validation error of missing input.
So the test provided by @cehbrecht that defines input string_2
with min_occurs=1
and default='two'
(https://github.com/bird-house/emu/pull/114/files#diff-489abd19af7c899a97437774b5d153eb55c7110d3d49d1f0ef23421fc44afd7fR29-R34) which resolves as 'two'
when omitted (https://github.com/bird-house/emu/pull/114/files#diff-44189833c54dd811c26c176a0ad92fd2f40c78d15ef8ceb5ae7d23625ec6fb5dR27-R30) is invalid.
I believe PyWPS is not respecting the spec in this case.
The input should be provided explicitly in the execute XML body, but with empty value to respect the minOccurs=1
schema validation and use the default='two'
. Something like <wps:Data></wps:Data>
must be provided. The JSON equivalent would be {"string_2": null}
instead of not having "string_2"
in the body at all.
That being said, I think the intended use of default
parameter in PyWPS is used to simply omit the input rather than explicitly providing it as empty, since the code currently picks that default value when the field is omitted. Therefore that input should be defined with min_occurs=0
to be compliant according to XML.
The current pywps implementation seems to use the default
value for initialization ... no matter the min_occurs
value:
https://github.com/geopython/pywps/blob/a8076216d2b2db0962bf3fad22f84059193447e6/pywps/inout/basic.py#L937
https://github.com/geopython/pywps/blob/4bc485021e5857a98964ee33e65cd823ca7e50ae/pywps/app/Service.py#L117
There is a previous discussion #226
I still find it confusing :)
The spec says (?):
- A default value is only used when
min_occurs=0
. - The default value is only set when a value is provided and it is empty.
I find it hard to set and recognize empty
values. So we could adjust the implementation in this way:
- A default value is only used when
min_occurs=0
. - The default value is only set when a value is not provided (or empty).
In case of min_occurs>=1
and a default value is configured we raise a warning?
Currently we have min_occurs=1
as default. Should it be min_occurs=0
?
I think the "conceptual" min_occurs=1
by default is valid. By this, I mean that user not explicitly providing a value for min_occurs
should default to 1
. Most users intend inputs to be required unless a default
is added or min_occurs=0
is explicitly provided.
What we could do is define min_occurs=None
as function parameter, and consider following cases:
- If the user does provide
min_occurs
explicitly, we swapNone
to its value - If the user provided
min_occurs=0
and nodefault
, the internal value ofdefault=None
is used and that input is optional. - If the user provided
min_occurs=0
and somedefault
value, then internaldefault
is set to that value. The input is optional, but omitting it during execute uses thedefault
. - If the user provided only
default
(without anymin_occurs
),min_occurs=0
is used instead of1
(resolve transparently). - Following those resolution,
min_occurs>=1
anddefault
is set, then raise a warning. Thedefault
could be ignored completely in this case (replace it byNone
). The only case when that would happen is when the user explicitly provided both parameters with erroneous/conflicting values.
I don't think we should worry about the case of "empty value provided", because the resulting value in Python code, whether parsed from XML or JSON, will result in None
regardless if the input was omitted (i.e.: via using default=None
) or explicitly passed empty/null (i.e.: via parsed value conversion to None
).
Instead of being smart and make an automatic adjustment, maybe PyWPS could raise a Warning when instantiating the Process. We'd catch these issues during development and resolve them explicitly.
@fmigneault I would agree with @huard and only raise a Warning when a default
is given and min_occurs>0
. The default
will only be used when min_occurs=0
and no value is provided.
I think the main trouble is the "mental confusion" ... it will be easier to update the wps services and make it clear in the docs and examples.
Agree? Should we do this already for the 4.4.x branch or only on 4.5.x?
@cehbrecht
I am fine with a warning. It should offer sufficient information for users to manually updated it as needed.
I think it is worth having in 4.4.x (and any other patches) until 4.5 is completely passing the test suite and properly released (as 4.6? or 4.5.x+1 according to #590).