OSCAL
Queries for schema documentation top-down polish
User Story:
Consistency and usage of language and terminology in documentation, including editorial policy, can to a certain degree be supported with Schematron. For example, a Schematron rule can say 'descriptions should not be full sentences, only phrases' and then detect periods or other forbidden punctuation. More generally (and more lightweight), queries such as `//formal-name` will help us take a top-down view of the usage of this element.
Goals:
Develop and apply some simple queries to run over Metaschema source, to provide editorial support and improve the level of 'polish' of the design and resulting documents. Deploy them as either/both Schematron or (documented) XPath/XQuery.
Making the actual changes is not in scope for this Issue; it can either be done as part of regular editorial work, or spun into an Issue of its own. However, this Issue can be considered done as soon as we have some means of assurance that the docs have the needed consistency, with long-term support tooling being a spin-off goal.
Example rules (for discussion / tbd):
- formal names should be in title case (capitalized)
- list items should be constituted of phrasing, not sentences
- list items punctuated with `;` (?)
- same with allowed-values/enum
- same with descriptions
- descriptions should not include parentheses/brackets? etc.
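
A couple of the example rules above could be sketched in Schematron roughly as follows. This is an untested sketch, not an agreed rule set: it assumes the XSLT 2.0 query binding, reuses the Metaschema namespace bound to `m:` elsewhere in this thread, and uses a hypothetical list of "small words" to approximate title case. The rule ids and roles are placeholders.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:ns uri="http://csrc.nist.gov/ns/oscal/metaschema/1.0" prefix="m"/>
  <sch:pattern id="editorial-polish">
    <sch:rule context="m:formal-name">
      <!-- Hypothetical list of words allowed to stay lower case in a title. -->
      <sch:let name="small-words"
               value="('a', 'an', 'and', 'for', 'in', 'of', 'on', 'or', 'the', 'to')"/>
      <!-- The first character must not be a lower-case letter. -->
      <sch:assert role="warning" id="formal-name-starts-capitalized"
                  test="matches(normalize-space(.), '^\P{Ll}')">Formal name should begin with a capital letter.</sch:assert>
      <!-- Every word is either a small word or does not start in lower case. -->
      <sch:assert role="warning" id="formal-name-title-case"
                  test="every $w in tokenize(normalize-space(.), ' ')
                        satisfies ($w = $small-words or matches($w, '^\P{Ll}'))">Formal name should be in title case.</sch:assert>
    </sch:rule>
    <sch:rule context="m:description">
      <!-- Flag parentheses and square brackets anywhere in the description text. -->
      <sch:report role="warning" id="description-has-brackets"
                  test="matches(., '[\(\)\[\]]')">Description contains parentheses or brackets.</sch:report>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

For the lighter-weight, query-only approach, an XPath 2.0 expression such as `distinct-values(//m:formal-name/normalize-space(.))` would list every distinct formal name for review by eye.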
Dependencies:
None.
Acceptance Criteria
- [ ] All OSCAL website and readme documentation affected by the changes in this issue have been updated. Changes to the OSCAL website can be made in the docs/content directory of your branch.
- [ ] A Pull Request (PR) is submitted that fully addresses the goals of this User Story. This issue is referenced in the PR.
- [ ] The CI-CD build process runs without any reported errors on the PR. This can be confirmed by reviewing that all checks have passed in the PR.
{The items above are general acceptance criteria for all User Stories. Please describe anything else that must be completed for this issue to be considered resolved.}
Combining this with usnistgov/OSCAL#839, at least some of these items could be supported via Schematron.
This sounds interesting and beneficial (to me, personally). Should we not declare https://github.com/usnistgov/OSCAL/issues/1185 as a dependency and finish that before we have Schematron that enforces a documentation style we haven't formally documented yet?
Sure. We could also include, or create follow-on tickets for, a couple of other items we have discussed:
- updated CSS stylesheet for editing Metaschema in oXygen (I have one somewhere)
- whitespace normalization (and maintenance) strategy for metaschema sources?
Let's meet and float requirements, and spike a representative Schematron with relevant XPath queries to demonstrate the approach, then decide whether to flesh it out and include more requirements. @aj-stein-nist will set up and coordinate the meeting.
@Rene2mt, I am going to send you an email regarding ☝️. You are the only person for whom I cannot view a calendar, but will want to get a brief meeting on the agenda to make sure this does not slip through the cracks. Once I get that, I will book something for all of us.
I tentatively scheduled a meeting and some pairing volunteered by Wendell on Friday. I will write some updates here on Friday.
Example, with the m: prefix bound to Metaschema:
<sch:ns uri="http://csrc.nist.gov/ns/oscal/metaschema/1.0" prefix="m"/>
<sch:pattern>
<sch:rule context="m:description">
<sch:assert role="error" test="ends-with(.,'.')" id="description-ends-with-dot">Description should end with a period.</sch:assert>
<sch:assert role="error" test="string-length(.) gt 6" id="description-long-enough">Description is too short.</sch:assert>
</sch:rule>
</sch:pattern>
We all met as a group and tried one of the tentative items for editorial or documentation requirements beyond capitalization. We will push up the PR as proof of spike, and either merge or close with the sprint and reopen accordingly with Dave after he returns from leave.
@Rene2mt had some ideas regarding the recommendation or enforcement of newer `prop`s around identifier and scope handling, for example https://github.com/usnistgov/OSCAL/blob/d3309d3f4c358b0acb8d4e7801bc0612d3bc652b/src/metaschema/oscal_metadata_metaschema.xml#L83-L87. He will post here, as will others for feedback, potentially to wrap up the spike for end of sprint.
Towards the end of the call, we experimented with SQF for adding periods to the rule in usnistgov/OSCAL#1501 and how to extend that for recommended `prop` addition re what Rene suggested, but we had trouble getting the period SQF fix working and I have not pushed that up. I may potentially do so later in the week.
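
For reference, a Schematron QuickFix for the missing period might look something like the sketch below. It is untested and is not the fix attempted on the call; it assumes an SQF-aware editor, the standard SQF namespace, and a hypothetical fix id `add-period`. The assertion itself is the one from the example above.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            xmlns:sqf="http://www.schematron-quickfix.com/validator/process"
            queryBinding="xslt2">
  <sch:ns uri="http://csrc.nist.gov/ns/oscal/metaschema/1.0" prefix="m"/>
  <sch:pattern>
    <sch:rule context="m:description">
      <!-- sqf:fix points at the quick fix offered when this assertion fails. -->
      <sch:assert role="error" id="description-ends-with-dot"
                  test="ends-with(.,'.')"
                  sqf:fix="add-period">Description should end with a period.</sch:assert>
      <!-- Hypothetical fix: append a text node "." as the last child of the description. -->
      <sqf:fix id="add-period">
        <sqf:description>
          <sqf:title>Append a period to the description</sqf:title>
        </sqf:description>
        <sqf:add position="last-child">.</sqf:add>
      </sqf:fix>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

One caveat with this approach: if a description ends in trailing whitespace, the period is appended after the whitespace, so a whitespace cleanup would need to happen first or as part of the fix.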
For unique identifier documentation in the Metaschemas, we should check the following:
- they contain prop fields, including "value-type", "identifier-type", "identifier-uniqueness", "identifier-scope", "identifier-persistence"
- for each prop, it should have the appropriate value
- a given identifier flag should only have unique props (no props with duplicate names)
@Rene2mt this is great. Almost specific enough to code to.
Let me take a shot at a narrower spec:
- For `flag` elements in a Metaschema instance determined to be IDs
  - (is this every flag named `id` or `uuid`? or those with a child `prop[@name='value-type'][@value='identifier']`?)
  - for each of the values "value-type", "identifier-type", "identifier-uniqueness", "identifier-scope", "identifier-persistence", a single element child `prop` can be found with that as its `@name`
  - i.e. throw errors if missing or duplicated
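
The narrower spec above could be sketched in Schematron along these lines. This is an untested sketch under stated assumptions: it settles the open question by treating any flag definition or flag with a child `prop` of `name="value-type"` and `value="identifier"` as an ID, it assumes those are `define-flag` or `flag` elements in the Metaschema namespace, and it checks only presence and uniqueness of the props, not their values. The ids are placeholders.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:ns uri="http://csrc.nist.gov/ns/oscal/metaschema/1.0" prefix="m"/>
  <sch:pattern id="identifier-props">
    <!-- Assumption: an ID is any flag carrying prop value-type=identifier. -->
    <sch:rule context="m:define-flag[m:prop[@name = 'value-type'][@value = 'identifier']]
                     | m:flag[m:prop[@name = 'value-type'][@value = 'identifier']]">
      <!-- Prop names every identifier flag is expected to carry. -->
      <sch:let name="required"
               value="('value-type', 'identifier-type', 'identifier-uniqueness',
                       'identifier-scope', 'identifier-persistence')"/>
      <!-- Error if any expected prop is missing. -->
      <sch:assert role="error" id="identifier-props-present"
                  test="every $n in $required satisfies exists(m:prop[@name = $n])">Identifier flag is missing one or more of the expected props.</sch:assert>
      <!-- Error if any prop name appears more than once on this flag. -->
      <sch:assert role="error" id="identifier-props-unique"
                  test="count(m:prop/@name) = count(distinct-values(m:prop/@name))">Identifier flag has props with duplicate names.</sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

Checking that each prop has an appropriate value would need the list of allowed values per prop, which is not spelled out in this thread.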
This is good stuff, but its complexity also suggests to me we might do better with an honest "constraint definition" element applicable to such IDs.
Indeed, it strikes me this might be expressed as a type of constraint:
<constraint>
<constrain-id type="human-oriented" uniqueness="instance" scope="cross-instance" persistence="per-subject"/>
</constraint>
If we had this instead of props, more would have to happen in back, but this would be cleaner and easier both to code and to validate, even in implementations.
Thoughts?
I think we are good here, but I am not sure how we build on the usefulness of the spiked code and move it forward. Thoughts?
I presume later this week we will opt to close, but I wanted to consider how to address some of @wendellpiez and @Rene2mt's discussion. As it stands, the code was reorganized, and it not only can be used in the IDE but is also enforced in CI/CD and blocks PRs, with only a small-lift change. We can now consider next steps on more useful rules to help develop near real-time feedback.
@david-waltermire-nist and I had a moment to sync on this, earlier today.
Next up on this Issue, IIRC, is to finalize a punchlist of features we want for the present (in an authoring Schematron), see to it they are implemented, and put it to bed for now. We briefly discussed how this overlaps with other validations, for example under CI/CD, and how that should not necessarily be a blocker.
We also spoke about the idea of introducing semantics into Metaschema that translate only into Schematron -- such as the `constraint` I propose above; however, we also agreed this could also be in scope, assuming we put it on the punchlist of features we want.
Thoughts? I think experience will show where it is nice to have the rules enforced, whether in authoring, in CI/CD or both. I could work with someone on finalizing this feature set - maybe @aj-stein-nist assuming aligning with the CI/CD setup makes sense.
Can we set aside some time on Thursday to chat, sketch some of this out, and put together some issues as something like a minimal epic, with a progression of tasks for the backlog?
I am going to wait until we can discuss this to close this issue. That will give us time to identify any additional new issues to work.
I dropped the ball on this. I just reviewed with Dave. For me: I need to set up a session with @david-waltermire-nist and @wendellpiez to plan out the next steps.
Dave, Wendell, and I synced up. As we imagine future work, there are a few avenues of work, organized by category.
- Guardrails for editorial changes to documentation strings and prose in OSCAL metaschemas
  - [x] Example: description fields with a sentence end in a period. (already completed in the PR for this issue, https://github.com/usnistgov/OSCAL/pull/1501)
  - [ ] Leading and trailing space in formal names, descriptions, and relevant fields https://github.com/usnistgov/metaschema-xslt/issues/31
  - [ ] SPIKE: review and add additional editorial Schematronable checks after style guide completed in usnistgov/OSCAL#1185
- Guardrails for syntax/semantics/higher-level recommendations (warnings) and requirements (errors, fail build)
  - [ ] Check/enforce Metaschema definition props (i.e. machine-oriented or human-oriented kind of identifier and the reference scope, as discussed in https://github.com/usnistgov/OSCAL/issues/801#issuecomment-1275040742)
  - [ ] Data-type checking for conformance https://github.com/usnistgov/metaschema-xslt/issues/32
- Documentation for developers: why the build failed for Schematron, how to find details, how to fix manually (or potentially automated fixing in the developer IDE with Schematron QuickFix, if applicable) https://github.com/usnistgov/metaschema-xslt/issues/33
- Stretch goals, follow-on after this work:
  - [ ] Auto-correction in CI/CD with transforms https://github.com/usnistgov/metaschema-xslt/issues/34
  - [ ] Add CSS/styling for editing Metaschemas and stylizing presentation in Oxygen and/or resulting SVRL https://github.com/usnistgov/metaschema-xslt/issues/24
I need to add these issues ☝️ in the evening to close this out as part of Sprint 60. I will not be moving this issue forward to Sprint 61, and will then ping Dave accordingly.
@david-waltermire-nist as agreed, I am closing this now that I have set up the relevant issues to continue this work. Apologies for the delay.