Conditional pattern set
I would like to be able to specify a condition (as an XPath expression to be applied to the input document) that determines whether a set of patterns should be executed.
I have two use-cases for this, both from a DITA context:
- Avoid detailed validation of a document that is marked as "in revision". (see #18)
- When using the same Schematron file for a whole set of DITA files of different topic types, there are likely to be some rules that can only match on elements within specific topic types. So I might have a context pattern like
*[contains(@class, ' custom-domain/myElement ')]. When processing a standard DITA topic that does not use the custom domain, this pattern will never match, and I already know this from looking at the root element. One option would be to extend the pattern to /*[contains(@class, ' custom-domain/myTopic ')]//*[contains(@class, ' custom-domain/myElement ')]. But the resulting xsl:template will still be checked for every single node. So adding some condition to the pattern element (or preferably to a set of patterns) could skip the whole traversal for all files that do not fulfil this condition.
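For illustration, the extended-context option could be written as the rule below. This is only a minimal sketch: the assertion test and its message are placeholders that do not come from the report above.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:pattern>
    <!-- Matches myElement only when the root element is the custom topic type.
         The generated template is still tried against every node. -->
    <sch:rule context="/*[contains(@class, ' custom-domain/myTopic ')]//*[contains(@class, ' custom-domain/myElement ')]">
      <!-- Placeholder assertion, purely illustrative -->
      <sch:assert test="normalize-space(.)">myElement should not be empty.</sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```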
In Schematron, I decided against putting in a special class of guard paths, because guards usually hide assertions. Similarly, I didn't want to support arbitrary composition of assertions (i.e. using and/or or case statements instead of rules) in order to encourage/enforce a flat structure of simple statements.
So, without claiming this is good enough for you, the most efficient way of doing guards currently is to use a top-level boolean variable for the guard condition, so you have <sch:let name="my-topic" value="/*[contains(@class, ' custom-domain/myTopic ')]" /> ... <sch:rule context="*[$my-topic][contains(@class, ' custom-domain/myElement ')]"> ....
This still visits every node, but only has a simple pre-calculated boolean test for each node in the worst case.
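Put together as a complete schema, the guard-variable approach might look roughly like this. It is only a sketch under a few assumptions. The queryBinding="xslt2" setting is assumed, because XSLT 1.0 does not allow variable references in match patterns, so an XSLT 1.0 based implementation may reject this rule context. The boolean() wrapper is added here to make the pre-calculated value explicit. The assertion is a placeholder.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <!-- Schema-level guard: true only when the root element is the custom topic type -->
  <sch:let name="my-topic" value="boolean(/*[contains(@class, ' custom-domain/myTopic ')])"/>
  <sch:pattern>
    <!-- Every element is still visited, but the guard is a cheap boolean check -->
    <sch:rule context="*[$my-topic][contains(@class, ' custom-domain/myElement ')]">
      <!-- Placeholder assertion, purely illustrative -->
      <sch:assert test="normalize-space(.)">myElement should not be empty.</sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```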
But it does make sense to me that there could be a good (command-level) optimization for the case you give: whether that is a guard provided for patterns (an extension of the new pattern/@document feature?) or for phases I don't know.
In fact, I think all that is needed is to mark the pattern so that as soon as one rule fires (i.e. a dummy top-level rule with context "/*[$my-topic]"), no subsequent rules need to fire. This would be a variant of the other optimization that if one assertion fails, no other assertions or rules need to be tested. Would that satisfy your requirement?
Thanks for your thoughts.
The disadvantage of adding the condition to the patterns is poor worst-case behavior: even when there is only a single rule in that pattern, the whole document will still be processed.
However, I agree that @followup on a rule could do the same and is even more powerful, since it is not limited to conditions at document level but can deactivate the validation of any subtree.
Just as a comparison of the source-code variants, to get a feeling:
Using a condition on a pattern:
<sch:pattern condition="contains(/*/@class, ' custom-domain/myTopic ')">
  <sch:rule context="*[contains(@class, ' custom-domain/myElement ')]">
    <sch:assert test="...">
      <!-- message -->
    </sch:assert>
  </sch:rule>
</sch:pattern>
Using @followup on a rule:
<sch:let name="is-not-myTopic" value="not(contains(/*/@class, ' custom-domain/myTopic '))"/>
<sch:pattern>
  <sch:rule context="*[$is-not-myTopic]" followup="skip-content">
    <!-- no tests, just abort the validation -->
  </sch:rule>
  <sch:rule context="*[contains(@class, ' custom-domain/myElement ')]">
    <sch:assert test="...">
      <!-- message -->
    </sch:assert>
  </sch:rule>
</sch:pattern>
I'm gonna start the implementation of my validation framework within the next 2-3 weeks using @followup and will post my experience here...