javarosa icon indicating copy to clipboard operation
javarosa copied to clipboard

Validate reports bogus type mismatch error when checking XPath date arguments

Open tiritea opened this issue 6 years ago • 11 comments

Software versions

javaRosa as it exists in Validate as it exists in XSLForm Online v1.3.1

Problem description

Validate appears to be performing a static evaluation of XPath expressions as part of its validation of XPath date/time/dateTime function arguments. But because it performs a static evaluation, it does not pickup valid runtime values, and as a consequence can throw a bogus type-mismatch error when it sees an invalid dateTime string (even though in actual runtime this string would contain a valid date string).

Steps to reproduce the problem

see https://forum.opendatakit.org/t/validate-issue-for-decimal-date-time/17710/6

specifically:

The simplest testcase I was able to come up with exhibiting the behavior is the above, namely

date:	year, default=2019
result = decimal-date-time(concat(${year},"-01-01"))

which is valid and runs fine in Collect, but causes a form type-mismatch error with Validate. I suspect this may only apply to date/time functions type-checking, but until we can isolate it in the code I cant be 100% certain this wont pop up in other non-date related cases.

Expected behavior

Validate should either not attempt to (statically) evaluate XPath expression in order to guess their type, for the purpose of type checking function arguments, or must fully evaluate all XPath expressions (infeasible?), including any defaults.

Other information

workaround requires addition of an explicit (and unnecessary) coalesce() call to effectively initialize datetime function arguments to something that will result in a valid datetime string, just for the purpose of validation.

tiritea avatar Feb 13 '19 09:02 tiritea

Please note that I havent started to try and figure out where in javaRosa/Validate it is doing this supposed static type-checking evaluation. So although it is readily reproducible for dateTime function argument checking, I dont know whether this 'static' analysis might also have failure scenarios wrt to other data types. I would want to be sure it doesnt before closing out this bug, but till we can isolate the actual code involved its probably not possible to say categorically it doesnt.

tiritea avatar Feb 13 '19 09:02 tiritea

I think what's going on here is that Validate steps through the form as if it were a form-filling client. It never gets a year value so it attempts to evaluate decimal-date-time("-01-01") and that leads to the crash.

Related to #481.

lognaturel avatar Sep 16 '19 16:09 lognaturel

my suspicion is that Validate may just be evaluating each binding XPath expression in situ, without actually resolving any (?) instance XML element references to their actual value (substitutes '' instead?). Hence doesn't pickup any defaults.

tiritea avatar Sep 17 '19 00:09 tiritea

Can you please share your full form? Did you try setting a default of e.g. 2019 for year and still get that behavior?

lognaturel avatar Sep 17 '19 03:09 lognaturel

try this:

validate.xlsx

BTW XLSForm Online used to throw a type mismatch error, just tried it now and its barfing with "java not found" on it (!) ha. I assume 'cause its now crashing on it...?

tiritea avatar Sep 17 '19 03:09 tiritea

Form conversion with XLSForm Online v1.6.1 at http://opendatakit.org/xlsform/ succeeds. It also validates from Validate 1.13.2 directly. I can reproduce the type mismatch if I omit the default, however.

I now see that you had listed a default in your original post but I'm wondering whether somehow you might not have it in the form?

I can't really think of any action to take, here. Perhaps there should be better documentation about the default requirement in a case like this but I don't know where that could go.

lognaturel avatar Sep 26 '19 04:09 lognaturel

Let me try to reproduce.

tiritea avatar Sep 26 '19 05:09 tiritea

Looking back at the forum thread this came from, it looks like there was no default specified.

For better or worse, Collect doesn't show an exception on load when the argument is "-01-01" because type exceptions on first form computation are swallowed. However, an exception will be shown if the year value is changed to another invalid value like -1, "cheese", etc, or changed to a valid year and then cleared. That exception can be dismissed by the enumerator.

Ideally, form designers would include appropriate constraints so that the runtime exceptions would never be needed but I think they're useful as fallbacks.

In Validate, any type of runtime exception hit while traversing the form make validation fail. We could silence type exceptions because it's true that they are likely to fail in the absence of defaults. I argue that it is helpful to users to get a hint that something isn't quite right and that they should provide defaults and constraints. Defaulting to the current year and constraining the year between 0 and the current year (or year + 5 or whatever is appropriate for the use case) would make for a better form design.

No matter what, the current exception messages are cryptic. They should at the very least be explicit about what argument value caused the problem. I've proposed wording at #494.

lognaturel avatar Sep 26 '19 17:09 lognaturel

The issue was that even with a default, for ${year], Validate was throwing an error. That appears to have been fixed recently along the way somehwere (?)...

But the root issue - as I see it - remains, that Validate will flat out reject a valid form containing

decimal-date-time(concat(${year},"-01-01"))

Even if, say, ${year} is a required number question [unless you work-around the bug by giving ${year} a default, or replacing it with a coalesce(${year},"2019"), etc....].

I do see the benefit in performing a static analysis of all calculations - when you attempt to load a form - as a means to pickup errors. But fundamentally I dont believe you can categorically state in the general case that a calculation is invalid by performing a static analysis of the calcuation; elements in an XPath expressions are inherently dynamically type cast when you actually run them. So specific elements that may, for example, appear to be null initially (and hence may cause the expression not to evaluate correctly) may in fact be perfectly OK when you actually run the form normally.

I dont think type check mismatches are a particularly common user error [correct me if I'm wrong] so would it be sufficient for Validate to instead flag these as warnings instead of fatal errors? (since Validate cant actually be certain they are in fact actual errors, eg above example).

Note, the fact there is now a non-intrusive workaround - ie being able to set the default - is certianly a lot better state of affairs than having to rewrite calculations using coalesce(), which was very intrusive. Which is to say, if you push me on the issue, I'll probably back down... :-)

tiritea avatar Sep 26 '19 22:09 tiritea

Ideally, form designers would include appropriate constraints so that the runtime exceptions would never be needed

As it happens, making the calculation only relevant if ${year} != '' has no effect - Validate still barfs. Basically, you have to fake out Validate with a default or coalesce workaround. Again, this goes to the root of the problem IMO: static analysis of dynamically interpreted expressions.

tiritea avatar Sep 26 '19 23:09 tiritea

This seems to related to solving the issue raised in this thread.

I have simplified a form showing the issue using the example of collecting an employee identifier, using it to pull the date of hire from an external file, and then calculating the days since hire. type_mismatch.xlsx

Running the form through XLSForm Online v2.x throws an error.

XPath evaluation: type mismatch   
The value "mycsv" can't be converted to a date.

When 'mycsv' in the XLSForm is the reference within a pulldata to a filename, and the result of the pull data could very well be a properly formatted string such as "2010-08-02" or whatnot.

danbjoseph avatar Aug 23 '22 19:08 danbjoseph