Enrich narrative schema generation with rules that use regular expressions

Open jwoLondon opened this issue 6 years ago • 0 comments

Currently (as of [email protected]), narrative schemas can use rules that

  • require the presence, or a number of instances, of a label (minimumOccurrences)
  • require minimum content length within a paired label (minimumTrimmedTextLength)
  • require that one label follows another, allowing label orders to be defined (followedBy)

What we are not able to do yet is validate against specific content within paired labels. One approach would be to add the ability to validate against a regex. This would give us some flexibility in validating content.

One open question is whether we simply allow a boolean regexMatch (as below), or whether we can process the returned matched values in some way within the YAML rule definition.
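To make the two options concrete, here is a minimal Python sketch. Python's `re` module stands in for whichever regex engine litvis would use, and the function names `regex_match` and `regex_captures` are hypothetical, not part of litvis. The first returns only a boolean; the second returns the matched substrings, which a rule could then process further.

```python
import re
from typing import List


def regex_match(pattern: str, text: str) -> bool:
    """Boolean option: True if the pattern occurs anywhere in the text."""
    return re.search(pattern, text) is not None


def regex_captures(pattern: str, text: str) -> List[str]:
    """Alternative option: return every matched substring for further processing."""
    return [m.group(0) for m in re.finditer(pattern, text)]


content = "A hallucinator differs from a Hallucinator-like confuser."
print(regex_match(r"[hH]allucinator", content))     # True
print(regex_captures(r"[hH]allucinator", content))  # ['hallucinator', 'Hallucinator']
```

The boolean form is enough for a pass/fail rule; the list form would be needed if a rule wanted to count or compare the matched values.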

Some Examples

Vis Algebra

For a vis algebra schema, we might require that content refers to hallucinators, confusers, jumblers and misleaders under the relevant 'principle' assessment:

  - description: invariance assessment must make reference to hallucinators.
    selector:
      label: invarianceAssessment
    children:
      regexMatch: '[hH]allucinator'

  - description: unambiguity assessment must make reference to confusers.
    selector:
      label: unambiguityAssessment
    children:
      regexMatch: '[cC]onfuser'

  - description: data-visualization correspondence assessment must make reference to jumblers and misleaders.
    selector:
      label: correspondenceAssessment
    children:
      regexMatch: '(?=[\s\S]*[jJ]umbler)(?=[\s\S]*[mM]isleader).+'
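A rough sketch of how such rules might be evaluated, in Python for illustration only. The `failed_rules` helper, the flattened rule dictionaries and the `content_by_label` mapping are assumptions made for this example; litvis's actual rule processing may look quite different. Because `.` does not match a newline by default, the lookaheads here use `[\s\S]*` so that both terms can appear on any line and in either order.

```python
import re
from typing import Dict, List

# Hypothetical flattened form of the rules above (label + pattern + description).
RULES: List[Dict[str, str]] = [
    {
        "description": "invariance assessment must make reference to hallucinators.",
        "label": "invarianceAssessment",
        "regexMatch": r"[hH]allucinator",
    },
    {
        "description": "unambiguity assessment must make reference to confusers.",
        "label": "unambiguityAssessment",
        "regexMatch": r"[cC]onfuser",
    },
    {
        "description": "data-visualization correspondence assessment must make "
        "reference to jumblers and misleaders.",
        "label": "correspondenceAssessment",
        # Two lookaheads act as a logical AND; [\s\S]* also spans line breaks.
        "regexMatch": r"(?=[\s\S]*[jJ]umbler)(?=[\s\S]*[mM]isleader)",
    },
]


def failed_rules(rules: List[Dict[str, str]], content_by_label: Dict[str, str]) -> List[str]:
    """Return the descriptions of rules whose pattern is absent from the labelled text."""
    failures = []
    for rule in rules:
        text = content_by_label.get(rule["label"], "")
        if re.search(rule["regexMatch"], text) is None:
            failures.append(rule["description"])
    return failures


content = {
    "invarianceAssessment": "Reordering rows is a hallucinator here.",
    "unambiguityAssessment": "No problems found.",
    "correspondenceAssessment": "The log scale is a misleader.\nSorting acts as a jumbler.",
}

for description in failed_rules(RULES, content):
    print("Rule failed:", description)
```

With this sample content only the unambiguity rule is reported, since its text never mentions a confuser.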

Unit Testing

Rather than just adding a checkbox to content whose pass or fail status is a subjective judgement, we might use a regex to validate that the user has confirmed that the test took place and passed:

  - description: must include a '[ ] passed?' checkbox.
    selector:
      label: myTestableLabel
    children:
      regexMatch: '\[.\] [pP]assed\?'

  - description: test has not passed.
    selector:
      label: myTestableLabel
    children:
      regexMatch: '\[x\] [pP]assed\?'

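These are only suggestions for the syntax. We might need to quote or escape special characters depending on how tolerant YAML is: an unquoted value that begins with [ is read as the start of a flow sequence, whereas single quotes keep brackets and backslashes literal.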
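As an illustration of how the two unit-testing rules might combine, the following Python sketch classifies some labelled content as missing the checkbox, not passed, or passed. The `checkbox_status` function and its return strings are hypothetical and only mirror the two patterns proposed above.

```python
import re

HAS_CHECKBOX = re.compile(r"\[.\] [pP]assed\?")  # any single character in the box
IS_CHECKED = re.compile(r"\[x\] [pP]assed\?")    # box ticked with a lowercase x


def checkbox_status(text: str) -> str:
    """Classify text by applying the two proposed patterns in turn."""
    if HAS_CHECKBOX.search(text) is None:
        return "missing checkbox"
    if IS_CHECKED.search(text) is None:
        return "not passed"
    return "passed"


print(checkbox_status("Axis labels are legible."))              # missing checkbox
print(checkbox_status("Axis labels are legible. [ ] passed?"))  # not passed
print(checkbox_status("Axis labels are legible. [x] Passed?"))  # passed
```

As written, an uppercase `[X]` would satisfy the first pattern but not the second, so `\[[xX]\]` may be worth considering.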

jwoLondon, Oct 29 '18 10:10