javarosa
javarosa copied to clipboard
Validate reports bogus type mismatch error when checking XPath date arguments
Software versions
javaRosa as it exists in Validate as it exists in XSLForm Online v1.3.1
Problem description
Validate appears to be performing a static evaluation of XPath expressions as part of its validation of XPath date/time/dateTime function arguments. But because it performs a static evaluation, it does not pickup valid runtime values, and as a consequence can throw a bogus type-mismatch error when it sees an invalid dateTime string (even though in actual runtime this string would contain a valid date string).
Steps to reproduce the problem
see https://forum.opendatakit.org/t/validate-issue-for-decimal-date-time/17710/6
specifically:
The simplest testcase I was able to come up with exhibiting the behavior is the above, namely
date: year, default=2019 result = decimal-date-time(concat(${year},"-01-01"))
which is valid and runs fine in Collect, but causes a form type-mismatch error with Validate. I suspect this may only apply to date/time functions type-checking, but until we can isolate it in the code I cant be 100% certain this wont pop up in other non-date related cases.
Expected behavior
Validate should either not attempt to (statically) evaluate XPath expression in order to guess their type, for the purpose of type checking function arguments, or must fully evaluate all XPath expressions (infeasible?), including any defaults.
Other information
workaround requires addition of an explicit (and unnecessary) coalesce() call to effectively initialize datetime function arguments to something that will result in a valid datetime string, just for the purpose of validation.
Please note that I havent started to try and figure out where in javaRosa/Validate it is doing this supposed static type-checking evaluation. So although it is readily reproducible for dateTime function argument checking, I dont know whether this 'static' analysis might also have failure scenarios wrt to other data types. I would want to be sure it doesnt before closing out this bug, but till we can isolate the actual code involved its probably not possible to say categorically it doesnt.
I think what's going on here is that Validate steps through the form as if it were a form-filling client. It never gets a year value so it attempts to evaluate decimal-date-time("-01-01")
and that leads to the crash.
Related to #481.
my suspicion is that Validate may just be evaluating each binding XPath expression in situ, without actually resolving any (?) instance XML element references to their actual value (substitutes '' instead?). Hence doesn't pickup any defaults.
Can you please share your full form? Did you try setting a default of e.g. 2019 for year and still get that behavior?
try this:
BTW XLSForm Online used to throw a type mismatch error, just tried it now and its barfing with "java not found" on it (!) ha. I assume 'cause its now crashing on it...?
Form conversion with XLSForm Online v1.6.1 at http://opendatakit.org/xlsform/ succeeds. It also validates from Validate 1.13.2 directly. I can reproduce the type mismatch if I omit the default, however.
I now see that you had listed a default in your original post but I'm wondering whether somehow you might not have it in the form?
I can't really think of any action to take, here. Perhaps there should be better documentation about the default requirement in a case like this but I don't know where that could go.
Let me try to reproduce.
Looking back at the forum thread this came from, it looks like there was no default specified.
For better or worse, Collect doesn't show an exception on load when the argument is "-01-01"
because type exceptions on first form computation are swallowed. However, an exception will be shown if the year value is changed to another invalid value like -1, "cheese", etc, or changed to a valid year and then cleared. That exception can be dismissed by the enumerator.
Ideally, form designers would include appropriate constraints so that the runtime exceptions would never be needed but I think they're useful as fallbacks.
In Validate, any type of runtime exception hit while traversing the form make validation fail. We could silence type exceptions because it's true that they are likely to fail in the absence of defaults. I argue that it is helpful to users to get a hint that something isn't quite right and that they should provide defaults and constraints. Defaulting to the current year and constraining the year between 0 and the current year (or year + 5 or whatever is appropriate for the use case) would make for a better form design.
No matter what, the current exception messages are cryptic. They should at the very least be explicit about what argument value caused the problem. I've proposed wording at #494.
The issue was that even with a default, for ${year], Validate was throwing an error. That appears to have been fixed recently along the way somehwere (?)...
But the root issue - as I see it - remains, that Validate will flat out reject a valid form containing
decimal-date-time(concat(${year},"-01-01"))
Even if, say, ${year} is a required number question [unless you work-around the bug by giving ${year} a default, or replacing it with a coalesce(${year},"2019"), etc....].
I do see the benefit in performing a static analysis of all calculations - when you attempt to load a form - as a means to pickup errors. But fundamentally I dont believe you can categorically state in the general case that a calculation is invalid by performing a static analysis of the calcuation; elements in an XPath expressions are inherently dynamically type cast when you actually run them. So specific elements that may, for example, appear to be null initially (and hence may cause the expression not to evaluate correctly) may in fact be perfectly OK when you actually run the form normally.
I dont think type check mismatches are a particularly common user error [correct me if I'm wrong] so would it be sufficient for Validate to instead flag these as warnings instead of fatal errors? (since Validate cant actually be certain they are in fact actual errors, eg above example).
Note, the fact there is now a non-intrusive workaround - ie being able to set the default - is certianly a lot better state of affairs than having to rewrite calculations using coalesce(), which was very intrusive. Which is to say, if you push me on the issue, I'll probably back down... :-)
Ideally, form designers would include appropriate constraints so that the runtime exceptions would never be needed
As it happens, making the calculation only relevant if ${year} != '' has no effect - Validate still barfs. Basically, you have to fake out Validate with a default or coalesce workaround. Again, this goes to the root of the problem IMO: static analysis of dynamically interpreted expressions.
This seems to related to solving the issue raised in this thread.
I have simplified a form showing the issue using the example of collecting an employee identifier, using it to pull the date of hire from an external file, and then calculating the days since hire. type_mismatch.xlsx
Running the form through XLSForm Online v2.x throws an error.
XPath evaluation: type mismatch
The value "mycsv" can't be converted to a date.
When 'mycsv'
in the XLSForm is the reference within a pulldata to a filename, and the result of the pull data could very well be a properly formatted string such as "2010-08-02" or whatnot.