Linting controls

Open dkolbly opened this issue 1 year ago • 8 comments

Related Issue:

#1061 Support linting of enum and sibling conventions

Description of changes:

  • Adds a suppress_checks option to the metaschema to configure turning off certain linting rules
  • Turns off those linting checks for places where we have violated the conventions (there are about 3)
  • Fixes data_lifecycle_state_id to follow the enum convention by adding a 99 (Other) enumerand and articulating that it should be used for "other"
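As a rough illustration of the first and third bullets, a suppressible enum check might look like the sketch below. This is not the code from the PR: the function name check_enum_convention, the placement of suppress_checks as a key on the attribute, and the check name enum_convention are all assumptions made for the example (the thread also refers to a lint key).

```python
from typing import Any, Dict, List

# Hypothetical names; the identifiers used by the actual PR may differ.
SUPPRESS_KEY = "suppress_checks"
ENUM_CONVENTION = "enum_convention"


def check_enum_convention(name: str, attr: Dict[str, Any]) -> List[str]:
    """Return lint messages for an enum attribute that lacks 0 or 99."""
    enum = attr.get("enum")
    if enum is None:
        return []  # not an enum attribute, nothing to check
    if ENUM_CONVENTION in attr.get(SUPPRESS_KEY, []):
        return []  # the check is explicitly turned off for this attribute
    messages = []
    for value, label in (("0", "Unknown"), ("99", "Other")):
        if value not in enum:
            messages.append(f"{name}: enum is missing {value} ({label})")
    return messages


# Illustrative attribute definitions, not copied from the schema.
missing_other = {
    "caption": "State ID",
    "enum": {"0": {"caption": "Unknown"}, "1": {"caption": "Active"}},
}
suppressed = dict(missing_other, **{SUPPRESS_KEY: [ENUM_CONVENTION]})

print(check_enum_convention("state_id", missing_other))
print(check_enum_convention("state_id", suppressed))
```

The first call reports the missing 99 (Other) member; the second returns an empty list because the attribute opts out of the check.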

dkolbly avatar Apr 24 '24 19:04 dkolbly

I will note that the validator in ocsf-validator does not yet understand lint and hence is failing against this PR. If discussion proceeds, I can probably address that issue as well at least at a surface level.

dkolbly avatar Apr 24 '24 23:04 dkolbly

The ocsf/ocsf-validator needs a change to support the new lint attribute. I believe the change will be in ocsf_validator/types.py, adding a lint key to OcsfAttr:

OcsfAttr = TypedDict(
    "OcsfAttr",
    {
        "$include": NotRequired[str],
        # "caption": NotRequired[str],
        "caption": str,
        "default": NotRequired[Any],
        "description": NotRequired[str],
        "enum": NotRequired[Dict[str, OcsfEnumMember]],
        "group": NotRequired[str],
        "is_array": NotRequired[bool],
        "lint": NotRequired[Sequence[str]],
        "max_len": NotRequired[int],
        "name": NotRequired[str],
        "notes": NotRequired[str],
        "observable": NotRequired[int],
        "range": NotRequired[Sequence[int]],
        "regex": NotRequired[str],
        "requirement": NotRequired[str],
        "sibling": NotRequired[str],
        "type": NotRequired[str],
        "type_name": NotRequired[str],
        "profile": NotRequired[Optional[Sequence[str]]],
        "values": NotRequired[Sequence[Any]],
        "@deprecated": NotRequired[OcsfDeprecationInfo],
    },
)

(I say "I believe..." because I haven't tried this change against the modified schema myself.)

rmouritzen-splunk avatar May 01 '24 19:05 rmouritzen-splunk

I will try it out!

dkolbly avatar May 02 '24 12:05 dkolbly

Do we have rules for naming conventions for fields like this? Thinking about:

  • We use a $ prefix for references to other fields (which is similar to JSON schema's conventions, as well as comments)
  • We also have a @ prefix in @deprecated as an annotation for the schema. I don't know the rationale behind that prefix, but it reminds me of a Python decorator (such as @deprecated here).

Should lint follow one of these conventions (@lint or $lint) since it's more of a validation directive or annotation than something that influences the final schemas themselves?

alanisaac avatar May 07 '24 17:05 alanisaac

@alanisaac great question... I don't have an opinion on that, really. There are lots of other properties that are directed at the compiler (i.e., that do not appear in the produced concrete schema, e.g., JSON Schema), like extends and the top-level name property in files, so I didn't think anything of using an undecorated property name for lint. The notion of linting the schema itself is new, so I'm not sure where else to look for a comparative example...

dkolbly avatar May 07 '24 17:05 dkolbly

I see a couple things:

  1. The suppressions use skewer-case instead of snake_case. Perhaps they should be snake_case for consistency.
  2. It's not clear to me what check each is actually suppressing. Can these be described somewhere?

(I'm quite interested because I'm adding a new event validation API, and I'll want to add support for these suppressions to that API.)

rmouritzen-splunk avatar Jun 04 '24 18:06 rmouritzen-splunk

@rmouritzen-splunk I'm not sure exactly where to write this down, but the intention is that both of these checks represent the conventions described here: https://github.com/ocsf/ocsf-docs/blob/main/Understanding%20OCSF.md#attributes

Specifically, enum_convention:

By convention, every Enum type has two common values with integer value 0 for Unknown and 99 for Other.

and for sibling_convention:

The sibling string attribute has the same name, minus the suffix.

as well as from https://github.com/ocsf/ocsf-docs/blob/main/Understanding%20OCSF.md#unique-ids:

Certain schema-unique attributes that also have a friendly name or caption have the same prefix but by convention use the _name suffix
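To make the sibling convention concrete, here is a minimal sketch of how such a check could be expressed. It is only one reading of the two quoted passages: the helper names expected_sibling and check_sibling_convention, the _id and _uid suffix handling, and the suppress_checks key are assumptions, not the PR's implementation.

```python
from typing import Any, Dict, Optional

# Hypothetical names; the thread also mentions a `lint` key for suppressions.
SUPPRESS_KEY = "suppress_checks"
SIBLING_CONVENTION = "sibling_convention"


def expected_sibling(name: str) -> Optional[str]:
    """Derive the conventional sibling name from an attribute name."""
    if name.endswith("_uid"):
        # Unique IDs: same prefix, with the _name suffix.
        return name[: -len("_uid")] + "_name"
    if name.endswith("_id"):
        # Enum IDs: same name, minus the suffix.
        return name[: -len("_id")]
    return None  # the convention does not apply


def check_sibling_convention(name: str, attr: Dict[str, Any]) -> Optional[str]:
    """Return a lint message if the declared sibling breaks the convention."""
    declared = attr.get("sibling")
    if declared is None or SIBLING_CONVENTION in attr.get(SUPPRESS_KEY, []):
        return None
    expected = expected_sibling(name)
    if expected is not None and declared != expected:
        return f"{name}: sibling is {declared!r}, convention suggests {expected!r}"
    return None


print(expected_sibling("severity_id"))
print(expected_sibling("class_uid"))
print(check_sibling_convention("severity_id", {"sibling": "sev_text"}))
```

An attribute that deliberately breaks the convention would list sibling_convention under the suppression key, and the check would then return None for it.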

dkolbly avatar Jun 18 '24 17:06 dkolbly

I'm not sure exactly where to write this down

Is it possible to add this in the metaschema's JSON Schema itself?
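One way this could look, sketched here as a Python dict that serializes to a JSON Schema fragment: each check name is a const paired with a description, so the documentation lives next to the allowed values. The property layout, the check names, and the description wording are assumptions for illustration and are not taken from the actual metaschema.

```python
import json
from typing import Any, Dict

# Hypothetical metaschema fragment for the suppression list.
SUPPRESS_CHECKS_SCHEMA: Dict[str, Any] = {
    "type": "array",
    "description": "Lint checks to skip for this attribute.",
    "uniqueItems": True,
    "items": {
        "oneOf": [
            {
                "const": "enum_convention",
                "description": "Every enum has 0 for Unknown and 99 for Other.",
            },
            {
                "const": "sibling_convention",
                "description": "The sibling string attribute has the same name, minus the suffix.",
            },
        ]
    },
}


def documented_checks(schema: Dict[str, Any]) -> Dict[str, str]:
    """Map each allowed check name to its description."""
    return {item["const"]: item["description"] for item in schema["items"]["oneOf"]}


print(json.dumps(SUPPRESS_CHECKS_SCHEMA, indent=2))
print(sorted(documented_checks(SUPPRESS_CHECKS_SCHEMA)))
```

A tool such as the event validation API mentioned earlier could read the descriptions from the same fragment it uses to validate the suppression names.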

rmouritzen-splunk avatar Jun 21 '24 21:06 rmouritzen-splunk