Queries for schema documentation top-down polish

Open wendellpiez opened this issue 4 years ago • 6 comments

User Story:

Consistency and usage of language and terminology in documentation including editorial policy can to a certain degree be supported with Schematron. For example, a Schematron rule can say 'descriptions should not be full sentences, only phrases' and then detect periods or other forbidden punctuation. More generally (and more lightweight), queries such as //formal-name will help us take a top-down view of the usage of this element.
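
As a rough illustration of both ideas, the sketch below pairs a punctuation check with a simple inventory of formal names. It assumes an XPath 2.0 query binding and the Metaschema namespace bound to the m: prefix. The ids, roles, and message wording are placeholders, and the "no terminal punctuation" policy is only the hypothetical rule from the paragraph above, not an agreed style.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
    <sch:ns uri="http://csrc.nist.gov/ns/oscal/metaschema/1.0" prefix="m"/>

    <!-- Hypothetical editorial policy: descriptions are phrases, so flag terminal punctuation. -->
    <sch:pattern id="description-phrasing">
        <sch:rule context="m:description">
            <sch:report role="warning" test="matches(normalize-space(.), '[.;:!?]$')" id="description-terminal-punctuation">Description ends with punctuation; a phrase was expected.</sch:report>
        </sch:rule>
    </sch:pattern>

    <!-- Top-down view: the equivalent of running //m:formal-name, one informational message per hit. -->
    <sch:pattern id="formal-name-inventory">
        <sch:rule context="m:formal-name">
            <sch:report role="info" test="true()" id="formal-name-listing">Formal name on <sch:value-of select="name(..)"/> '<sch:value-of select="../@name"/>': <sch:value-of select="normalize-space(.)"/></sch:report>
        </sch:rule>
    </sch:pattern>
</sch:schema>
```

The inventory pattern reports unconditionally, so it is better suited to an occasional review run than to a check that gates a build.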

Goals:

Develop and apply some simple queries to run over Metaschema source, to provide editorial support and improve the level of 'polish' of the design and resulting documents. Deploy them as either/both Schematron or (documented) XPath/XQuery.

Making the actual changes is not in scope for this Issue; it can either be done as part of regular editorial work, or spun into an Issue of its own. However, this Issue can be considered done as soon as we have some means of assurance that the docs have the needed consistency, with long-term support tooling being a spin-off goal.

Example rules (for discussion / tbd):

  • formal names should be in title case (capitalized)
  • list items should be constituted of phrasing, not sentences
  • list items punctuated with ; (?)
  • same with allowed-values/enum
  • same with descriptions
  • descriptions should not include parentheses/brackets? etc.
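
Two of the candidate rules above could be sketched as follows. This is only a starting point for discussion: the title-case test is deliberately naive (every whitespace-separated word must start with an uppercase letter or a digit, so short words such as "of" would be flagged), the ids and messages are invented for illustration, and both checks are reported as warnings because the rules themselves are still tbd.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
    <sch:ns uri="http://csrc.nist.gov/ns/oscal/metaschema/1.0" prefix="m"/>

    <sch:pattern id="editorial-candidates">
        <!-- Naive title case: each word starts with an uppercase letter or a digit. -->
        <sch:rule context="m:formal-name">
            <sch:assert role="warning" test="every $w in tokenize(normalize-space(.), ' ') satisfies matches($w, '^[\p{Lu}\p{N}]')" id="formal-name-title-case">Formal name '<sch:value-of select="normalize-space(.)"/>' should be in title case.</sch:assert>
        </sch:rule>

        <!-- Flag parentheses and square brackets anywhere in a description. -->
        <sch:rule context="m:description">
            <sch:assert role="warning" test="not(matches(., '[\(\)\[\]]'))" id="description-no-brackets">Description should not include parentheses or brackets.</sch:assert>
        </sch:rule>
    </sch:pattern>
</sch:schema>
```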

Dependencies:

None.

Acceptance Criteria

  • [ ] All OSCAL website and readme documentation affected by the changes in this issue have been updated. Changes to the OSCAL website can be made in the docs/content directory of your branch.
  • [ ] A Pull Request (PR) is submitted that fully addresses the goals of this User Story. This issue is referenced in the PR.
  • [ ] The CI-CD build process runs without any reported errors on the PR. This can be confirmed by reviewing that all checks have passed in the PR.

{The items above are general acceptance criteria for all User Stories. Please describe anything else that must be completed for this issue to be considered resolved.}

wendellpiez avatar Dec 10 '20 17:12 wendellpiez

Combining this with usnistgov/OSCAL#839, at least some of these items could be supported via Schematron.

wendellpiez avatar Aug 22 '22 13:08 wendellpiez

This sounds interesting and beneficial (to me, personally). Should we not declare https://github.com/usnistgov/OSCAL/issues/1185 as a dependency and finish that before we have Schematron that enforces a documentation style we haven't formally documented yet?

aj-stein-nist avatar Aug 22 '22 14:08 aj-stein-nist

Sure. We could also include, or create follow-on tickets for, a couple of other items we have discussed:

  • updated CSS stylesheet for editing Metaschema in oXygen (I have one somewhere)
  • whitespace normalization (and maintenance) strategy for metaschema sources?

wendellpiez avatar Aug 22 '22 14:08 wendellpiez

Let's meet and float requirements, spike a representative Schematron with relevant XPath queries to demonstrate the approach, and then decide whether to flesh it out and include more requirements. @aj-stein-nist will set up and coordinate the meeting.

aj-stein-nist avatar Oct 03 '22 13:10 aj-stein-nist

@Rene2mt, I am going to send you an email regarding ☝️. You are the only person for whom I cannot view a calendar, but will want to get a brief meeting on the agenda to make sure this does not slip through the cracks. Once I get that, I will book something for all of us.

aj-stein-nist avatar Oct 03 '22 21:10 aj-stein-nist

I tentatively scheduled a meeting and some pairing volunteered by Wendell on Friday. I will write some updates here on Friday.

aj-stein-nist avatar Oct 04 '22 20:10 aj-stein-nist

Example, with the m: prefix bound to Metaschema:

<sch:ns uri="http://csrc.nist.gov/ns/oscal/metaschema/1.0" prefix="m"/>

<sch:pattern>
    <sch:rule context="m:description">
        <sch:assert role="error" test="ends-with(.,'.')" id="description-ends-with-dot">Description should end with a period.</sch:assert>
        <sch:assert role="error" test="string-length(.) gt 6" id="description-long-enough">Description is too short.</sch:assert>
    </sch:rule>
</sch:pattern>

wendellpiez avatar Oct 11 '22 16:10 wendellpiez

We all met as a group and tried one of the tentative items for editorial or documentation requirements beyond capitalization. We will push up the PR as proof of spike, and either merge or close with the sprint and reopen accordingly with Dave after he returns from leave.

@Rene2mt had some ideas regarding the recommendation or enforcement of newer props around identifier and scope handling, for example https://github.com/usnistgov/OSCAL/blob/d3309d3f4c358b0acb8d4e7801bc0612d3bc652b/src/metaschema/oscal_metadata_metaschema.xml#L83-L87. He will post here, as will others for feedback, potentially to wrap up the spike for end of sprint.

Towards the end of the call, we experimented with SQF (Schematron QuickFix) to add missing periods for the rule in usnistgov/OSCAL#1501, and discussed how to extend that to the recommended prop additions Rene suggested. We had trouble getting the period fix working, so I have not pushed it up. I may do so later in the week.

aj-stein-nist avatar Oct 11 '22 17:10 aj-stein-nist

For unique identifier documentation in the Metaschemas, we should check the following:

  1. they contain prop fields, including "value-type", "identifier-type", "identifier-uniqueness", "identifier-scope", "identifier-persistence"
  2. for each prop, it should have the appropriate value
  3. a given identifier flag should only have unique props (no props with duplicate names)

Rene2mt avatar Oct 11 '22 17:10 Rene2mt

@Rene2mt this is great. Almost specific enough to code to.

Let me take a shot at a narrower spec:

  • For flag elements in a Metaschema instance determined to be IDs
    • (is this every flag named id or uuid? or those with a child prop[@name='value-type'][@value='identifier']?)
    • for each of the values "value-type", "identifier-type", "identifier-uniqueness", "identifier-scope", "identifier-persistence", a single element child prop can be found with that as its @name
      • i.e. throw errors if missing or duplicated
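
A minimal sketch of the narrower spec above, under stated assumptions: it takes the second reading of "determined to be IDs" (a child prop with name value-type and value identifier), it assumes the flag definitions are m:define-flag elements, and the ids and messages are invented. It only checks that each prop is present exactly once; checking that each prop carries an appropriate value is left out because the allowed values are not listed here.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
    <sch:ns uri="http://csrc.nist.gov/ns/oscal/metaschema/1.0" prefix="m"/>

    <!-- The five prop names every identifier flag is expected to carry. -->
    <sch:let name="id-prop-names" value="('value-type', 'identifier-type', 'identifier-uniqueness', 'identifier-scope', 'identifier-persistence')"/>

    <sch:pattern id="identifier-props">
        <!-- Assumption: identifier flags are define-flag elements marked with value-type=identifier. -->
        <sch:rule context="m:define-flag[m:prop[@name='value-type'][@value='identifier']]">
            <sch:let name="flag" value="."/>
            <!-- Expected names with no matching prop child. -->
            <sch:let name="missing" value="$id-prop-names[not(. = $flag/m:prop/@name)]"/>
            <!-- Expected names that appear on more than one prop child. -->
            <sch:let name="duplicated" value="for $n in $id-prop-names return $n[count($flag/m:prop[@name = $n]) gt 1]"/>

            <sch:assert role="error" test="empty($missing)" id="identifier-props-present">Identifier flag '<sch:value-of select="@name"/>' is missing prop(s): <sch:value-of select="string-join($missing, ', ')"/>.</sch:assert>
            <sch:assert role="error" test="empty($duplicated)" id="identifier-props-unique">Identifier flag '<sch:value-of select="@name"/>' has duplicated prop(s): <sch:value-of select="string-join($duplicated, ', ')"/>.</sch:assert>
        </sch:rule>
    </sch:pattern>
</sch:schema>
```

If the other reading is preferred (every flag named id or uuid), only the rule context would need to change.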

This is good stuff, but its complexity also suggests to me we might do better with an honest "constraint definition" element applicable to such IDs.

Indeed, it strikes me this might be expressed as a type of constraint:

<constraint>
  <constrain-id type="human-oriented" uniqueness="instance" scope="cross-instance" persistence="per-subject"/>
</constraint>

If we had this instead of props, more would have to happen on the back end, but this would be cleaner and easier both to code and to validate, even in implementations.

Thoughts?

wendellpiez avatar Oct 12 '22 13:10 wendellpiez

I think we are good here, but I am not sure how we demonstrate the usefulness of the spiked code and move it forward. Thoughts?

I presume we will opt to close this later this week, but I wanted to consider how to address some of @wendellpiez's and @Rene2mt's discussion. As it stands, it was reorganized, and it not only can be used in the IDE but is also enforced in CI/CD and blocks PRs, all with a small change. We can now consider next steps on more useful rules to help develop near real-time feedback.

aj-stein-nist avatar Nov 01 '22 19:11 aj-stein-nist

@david-waltermire-nist and I had a moment to sync on this, earlier today.

Next up on this Issue, IIRC, is to finalize a punchlist of features we want for the present (in an authoring Schematron), see to it they are implemented, and put it to bed for now. We briefly discussed how this overlaps with other validations, for example under CI/CD, and how that should not necessarily be a blocker.

We also spoke about the idea of introducing semantics into Metaschema that translate only into Schematron, such as the constraint I propose above. We agreed this could also be in scope, assuming we put it on the punchlist of features we want.

Thoughts? I think experience will show where it is nice to have the rules enforced, whether in authoring, in CI/CD or both. I could work with someone on finalizing this feature set - maybe @aj-stein-nist assuming aligning with the CI/CD setup makes sense.

wendellpiez avatar Nov 01 '22 19:11 wendellpiez

> Thoughts? I think experience will show where it is nice to have the rules enforced, whether in authoring, in CI/CD or both. I could work with someone on finalizing this feature set - maybe @aj-stein-nist assuming aligning with the CI/CD setup makes sense.

Can we set aside some time on Thursday to chat, sketch some of this out, and put together some issues as a minimal-ish epic with a progression of tasks for the backlog?

aj-stein-nist avatar Nov 01 '22 19:11 aj-stein-nist

I am going to wait until we can discuss this to close this issue. That will give us time to identify any additional new issues to work.

david-waltermire avatar Nov 02 '22 14:11 david-waltermire

> Thoughts? I think experience will show where it is nice to have the rules enforced, whether in authoring, in CI/CD or both. I could work with someone on finalizing this feature set - maybe @aj-stein-nist assuming aligning with the CI/CD setup makes sense.

> Can we set aside some time on Thursday to chat, sketch some of this out, and put together some issues as a minimal-ish epic with a progression of tasks for the backlog?

I dropped the ball on this. I just reviewed with Dave. For me: I need to set up a session with @david-waltermire-nist and @wendellpiez to plan out the next steps.

aj-stein-nist avatar Nov 22 '22 16:11 aj-stein-nist

Dave, Wendell, and I synced up. We see a few avenues for future work, organized by category.

  • Guardrails for editorial changes to documentation strings and prose in OSCAL metaschemas
    • [x] Example: description fields with a sentence end in a period. (already completed in the PR for this issue, https://github.com/usnistgov/OSCAL/pull/1501)
    • [ ] Leading and trailing space in formal names, descriptions, and relevant fields https://github.com/usnistgov/metaschema-xslt/issues/31
    • [ ] SPIKE: review and add additional editorial Schematronable checks after style guide completed in usnistgov/OSCAL#1185
  • Guardrails for syntax, semantics, and higher-level recommendations (warnings) and requirements (errors that fail the build)
    • [ ] Check/enforce Metaschema definition props (i.e. machine-oriented or human-oriented kind of identifier and the reference scope, as discussed in https://github.com/usnistgov/OSCAL/issues/801#issuecomment-1275040742)
    • [ ] Data-type checking for conformance https://github.com/usnistgov/metaschema-xslt/issues/32
  • Documentation for developers on why the build failed on a Schematron check, how to find the details, and how to fix the problem manually (or potentially automatically in the developer IDE with Schematron QuickFix, if applicable) https://github.com/usnistgov/metaschema-xslt/issues/33
  • Stretch-goals, follow-on after this work:
    • [ ] Auto-correction in CI/CD with transforms https://github.com/usnistgov/metaschema-xslt/issues/34
    • [ ] Add CSS/styling for editing Metaschemas and stylizing presentation in Oxygen and/or resulting SVRL https://github.com/usnistgov/metaschema-xslt/issues/24
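
For the leading and trailing space item in the list above, a check along these lines might be enough to start with. It is a sketch only: it assumes an XPath 2.0 query binding, covers just formal-name and description (which other fields count as "relevant" is still open), and uses an invented id and message.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
    <sch:ns uri="http://csrc.nist.gov/ns/oscal/metaschema/1.0" prefix="m"/>

    <sch:pattern id="whitespace-hygiene">
        <sch:rule context="m:formal-name | m:description">
            <!-- Fails when the string value starts or ends with a whitespace character. -->
            <sch:assert role="warning" test="not(matches(., '^\s|\s$'))" id="no-leading-trailing-space"><sch:value-of select="name()"/> on '<sch:value-of select="../@name"/>' has leading or trailing whitespace.</sch:assert>
        </sch:rule>
    </sch:pattern>
</sch:schema>
```

The test looks only at the two ends of the string, so runs of internal whitespace are left to whatever normalization strategy is eventually chosen.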

aj-stein-nist avatar Nov 23 '22 15:11 aj-stein-nist

I need to add these issues ☝️ in the evening to close this out as part of Sprint 60. I will not be moving this issue forward to Sprint 61, and will then ping Dave accordingly.

aj-stein-nist avatar Dec 05 '22 22:12 aj-stein-nist

@david-waltermire-nist as agreed, I am closing this now that I have set up the relevant issues to continue this work. Apologies for the delay.

aj-stein-nist avatar Dec 06 '22 16:12 aj-stein-nist