
Attack Object - Add tactic_name in addition to tactic_id

Open Noafr opened this issue 2 years ago • 1 comments

Problem: The current Attack object describes the involved technique (uid [string], name [string]) related to the attack and its associated tactics. As one technique may be associated with multiple tactics, the tactics field is of type [String array]. The current implementation provides the IDs of the associated tactics, but not their names. Some vendors will be sharing the tactic associated with a given event and analysts would definitely want to be able to execute queries like: mitre_tactic = 'Discovery'. This data is sent by our ADSecure product for Active Directory Protection.

Noafr avatar Oct 17 '22 10:10 Noafr

Example (most values were intentionally removed):

{
"Alert_Level": 15,
"src-ip": "",
"mitre_tactic": [
"Discovery"
],
"reference_id": "",
"src-port": ,
"src_ep_guid": "",
"ep_aid": "0",
"subscriberId": "",
"Details": "Multiple AD Queries seen, Attacker UserName= X",
"Rule": ,
"src_hostname": "",
"ad_event_ids": "",
"Category": "Recon",
"Description": "AD Privilege Group Enumeration Detected",
"attivo_uid": "",
"Alert_Level_Str": "Very High",
"user_name": "",
"src_username_arr": [],
"syslog_program": "ACTIVE DIRECTORY",

"dest-port": ,
"feature": "ADSecure",
"src_os_name": "",
"att_domain": "",
"dest-ip": "",
"src_mac": "",
"mitre_tech": [
"Permission Groups Discovery"
],
"mitre_tech_id": [
"T1069"
],
"root_process_name": "",
"dest_os": "",
"@timestamp": "",
"product_type": "",
"dest_host": "",
}

Noafr avatar Oct 17 '22 10:10 Noafr

Are you suggesting that we have an array of String tuples for tactics? Since we are assuming there is only one technique (both uid and String form) per Attack object (and attacks is always an array wherever Attack is referenced), we would need a Tactic object most likely to represent the tuple, otherwise we would need a pair of ordered arrays (e.g. tactic_uids, tactic_names).

pagbabian-splunk avatar Oct 18 '22 18:10 pagbabian-splunk

@pagbabian-splunk and @Noafr, having an array of Tactic objects within the Attack object seems simplest to me. The Tactic object can contain the name and uid as a tuple, and we can make either the name or the uid required. Any issues with that approach?

Aniak5 avatar Oct 19 '22 16:10 Aniak5

This is the subject of polls https://github.com/ocsf/ocsf-schema/discussions/243 and https://github.com/ocsf/ocsf-schema/discussions/313.

I do not believe data objects should contain id-name tuples. I agree with https://github.com/JasonKeirstead that sending both in a single data object is unnecessary redundancy. I disagree with Jason that one (tactic_id) should be mandatory. Data objects are oriented toward an audience. The entire object shown above is oriented toward human consumption, so it makes no sense to single out one enum (mitre_tactic) and require it to be either a numeric id (7) or an id/name tuple ([7, "Discovery"]). The first is discordant, a single unreadable number in an object full of strings, and the second is redundant.

Instead, I believe the object as shown, with just the tactic name, is appropriate for human consumption.

A different object whose purpose is machine-to-machine optimized communication and storage, would use numeric IDs for all properties including enums. It would be much smaller and completely unreadable, like raw IP packets, but tools (like Wireshark for pcap decoding) would both translate it to the identical readable JSON object and display it for human consumption.

Correct:

{
  "Alert_Level": 15,
  "src-ip": "",
  "mitre_tactic": [
    "discovery"
  ]
}

Discordantly Mismatched:

{
  "Alert_Level": 15,
  "src-ip": "",
  "mitre_tactic": [
    7
  ]
}

Unnecessarily Redundant:

{
  "Alert_Level": 15,
  "src-ip": "",
  "mitre_tactic": [
    [7, "discovery"]
  ]
}

davaya avatar Oct 24 '22 17:10 davaya

@davaya Wouldn't you expect a user to be able to query for the MITRE tactic id = 'X'? Presenting a MITRE ID without providing the associated tactic name doesn't make much sense; however, a MITRE ID is more appropriate for human consumption than other random id/name tuples.

Noafr avatar Oct 24 '22 17:10 Noafr

Maybe, depending on whether users normally use tactic IDs. The query UI would know the database and whether it stores IDs or names, and translate whatever is selected by the user to what is stored in the database.

A pick list could display both (like zip codes and city names, though that's a bad analogy because the name isn't 1:1 with the id), but if people never use the tactic ID (as with Windows GUIDs) there's little reason to display IDs in the pick list even if they are used in the database.

Web browsers know the color id to name mapping and the user can type either one, but a .css file never contains a tuple of both BlanchedAlmond and #FFEBCD. If you want to search a text .css file with a color-unaware editor like notepad, you have to manually search for both if you want to catch them all.

Bottom line: don't complicate data structures to compensate for perceived tool limitations. Design simple data structures and use tools to create the desired user experience.

davaya avatar Oct 24 '22 18:10 davaya

If we look at the MITRE Enterprise matrix in their portal, they display names but when you hover over them, the tip shows the TA# or the T#. They are both used, and I think the issue here is what mode a consumer is in from an analysis standpoint. However, a producer cannot know about all possible consumers, hence it doesn't know what mode to favor when emitting the events. I think this is the argument for sending the tuple.

pagbabian-splunk avatar Oct 24 '22 21:10 pagbabian-splunk

Shouldn't it be a design requirement that OCSF consumers, upon receiving a TA#, can display the name, and upon receiving a name, can display the TA#? The same logic that says "TA3821904" is not a valid tactic says that it doesn't have a name, and if it is valid then it does.

My concept of "mode" is IETF media type - a producer can send a picture in image/jpg or image/png "mode", consumers support one or the other or both, and if the consumer supports both it doesn't matter which the producer sent. One is better quality, the other is smaller size, and a consumer can receive a png image and store it in jpg format if storage size is important to that consumer. If all media types are lossless (bmp, tiff, png, etc) images can be converted among them without loss. Same for OCSF events; all data formats must be lossless.
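As a rough illustration of the lossless translation described above, here is a minimal Python sketch of a consumer that accepts either a tactic ID or a tactic name and resolves the other form. The table and the function name `resolve_tactic` are hypothetical, and the table lists only the two tactics quoted in this thread; a real consumer would presumably load the full set from the MITRE sources.

```python
from typing import Tuple

# Hypothetical lookup table; only the two tactics quoted in this thread are listed.
TACTIC_NAME_BY_ID = {
    "TA0001": "Initial Access",
    "TA0007": "Discovery",
}

# Reverse index, case-insensitive on the name.
TACTIC_ID_BY_NAME = {
    name.casefold(): uid for uid, name in TACTIC_NAME_BY_ID.items()
}


def resolve_tactic(value: str) -> Tuple[str, str]:
    """Return (id, name) given either form; raise KeyError if unknown."""
    key = value.strip()
    if key.upper() in TACTIC_NAME_BY_ID:
        uid = key.upper()
        return uid, TACTIC_NAME_BY_ID[uid]
    # Not a known ID, so treat it as a name (KeyError if it is neither).
    uid = TACTIC_ID_BY_NAME[key.casefold()]
    return uid, TACTIC_NAME_BY_ID[uid]


print(resolve_tactic("TA0007"))
print(resolve_tactic("discovery"))
```

An unknown value such as "TA3821904" raises `KeyError` in this sketch, which mirrors the point that an invalid tactic has neither an ID nor a name.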

davaya avatar Oct 25 '22 10:10 davaya

I see this similarly but with different details. An OCSF producer must send a consistent ID and may send an associated name. A consumer can expect to receive an ID and may look up a name if it isn't present. It can also populate (enrich) the schema ahead of storage, e.g. for stream processing purposes.

When we first designed the Attack object, we replicated the tactics (and maybe the techniques) as enums but we soon determined that we couldn't keep the schema in sync with changes in the MITRE matrix, so went with strings and assumed consumers had access to the MITRE matrix. We couldn't validate them in the earlier schema however - a consumer would need to do that.

Therefore I propose that we define a Tactic object and a Technique object, each having a required ID field and an optional name field. The Attack object would then have two required attributes: tactics as an array of Tactic objects, and technique as a Technique object. A producer can optionally populate the name, or the consumer can optionally look up the name and populate it, or simply store the name.

We could stay with the siblings technique_uid and technique_name instead of a Technique object, making technique_name optional, but it's more consistent if they are both objects.
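To make the proposed shape concrete, here is a minimal Python sketch, not the actual OCSF definition. The field names `uid` and `name` and the helper `to_event_dict` are assumptions chosen for illustration; the ID is required and the name is optional, as proposed above.

```python
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class Tactic:
    uid: str                      # required, e.g. "TA0007"
    name: Optional[str] = None    # optional, e.g. "Discovery"


@dataclass
class Technique:
    uid: str                      # required, e.g. "T1069"
    name: Optional[str] = None    # optional


@dataclass
class Attack:
    tactics: List[Tactic]         # required array of Tactic objects
    technique: Technique          # required single Technique object


def to_event_dict(attack: Attack) -> dict:
    """Serialize an Attack, omitting optional names that were not provided."""
    def strip_none(d: dict) -> dict:
        return {k: v for k, v in d.items() if v is not None}

    return {
        "tactics": [strip_none(asdict(t)) for t in attack.tactics],
        "technique": strip_none(asdict(attack.technique)),
    }


attack = Attack(
    tactics=[Tactic(uid="TA0007")],  # producer sent only the ID
    technique=Technique(uid="T1069", name="Permission Groups Discovery"),
)
print(to_event_dict(attack))
```

A consumer that wants names for querying could fill in `Tactic.name` after receipt, which is the enrichment step described above.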

pagbabian-splunk avatar Oct 26 '22 01:10 pagbabian-splunk

Add version to the attack object

rroupski avatar Oct 26 '22 18:10 rroupski

Yes, per @tankbusta and @awhite456 different versions of the ATT&CK matrix elements can coexist.

pagbabian-splunk avatar Oct 26 '22 18:10 pagbabian-splunk

We could define a Tactic object and a Technique object, each having a required string and an optional string. But without a schema there is no way to validate them: ["TA0007", "Discovery"] is valid, as are ["Discovery", "TA0007"], ["Discovery"], ["TA999345"], and ["Thursday", "Octopus"].

A schema would allow them to be validated, and could be automatically generated from the MITRE STIX sources https://github.com/mitre/cti/releases/tag/ATT%26CK-v12.0. The problem is that MITRE doesn't define any integer ids, so OCSF would need to create them, then create two enums for each tactic and technique, one mapping the OCSF integer ID to the MITRE "external_id" ("TA0001") and the other mapping the OCSF integer to the MITRE "shortname": ("initial-access") or "name": ("Initial Access"). It sure would be easier / more standard if MITRE had defined their own integer object "id", but they didn't. That doesn't prevent OCSF from creating those ids.

{
    "type": "bundle",
    "id": "bundle--c385d941-b6fa-49b6-b9a5-1d6ba4964473",
    "spec_version": "2.0",
    "objects": [
        {
            "x_mitre_domains": [
                "enterprise-attack"
            ],
            "object_marking_refs": [
                "marking-definition--fa42a846-8d90-4e51-bc29-71d5b4802168"
            ],
            "id": "x-mitre-tactic--ffd5bcee-6e16-4dd2-8eca-7b3beedf33ca",
            "type": "x-mitre-tactic",
            "created": "2018-10-17T00:14:20.652Z",
            "created_by_ref": "identity--c78cb6e5-0c4b-4611-8297-d1b8b55e40b5",
            "external_references": [
                {
                    "external_id": "TA0001",
                    "url": "https://attack.mitre.org/tactics/TA0001",
                    "source_name": "mitre-attack"
                }
            ],
            "modified": "2019-07-19T17:41:41.425Z",
            "name": "Initial Access",
            "description": "The adversary is trying to get into your network.\n\nInitial Access consists of techniques that use various entry vectors to gain their initial foothold within a network. Techniques used to gain a foothold include targeted spearphishing and exploiting weaknesses on public-facing web servers. Footholds gained through initial access may allow for continued access, like valid accounts and use of external remote services, or may be limited-use due to changing passwords.",
            "x_mitre_version": "1.0",
            "x_mitre_attack_spec_version": "2.1.0",
            "x_mitre_modified_by_ref": "identity--c78cb6e5-0c4b-4611-8297-d1b8b55e40b5",
            "x_mitre_shortname": "initial-access"
        }
    ]
}
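As a sketch of how the mapping could be generated from a bundle like the one above, the following Python pulls the external ID, name, and shortname out of each `x-mitre-tactic` object. The function name `tactics_from_bundle` and the trimmed sample bundle are hypothetical; it only reads the fields shown in the excerpt and does not assign any OCSF integer IDs.

```python
import json
from typing import Dict, List


def tactics_from_bundle(bundle: dict) -> List[Dict[str, str]]:
    """Extract (external_id, name, shortname) rows from x-mitre-tactic objects."""
    rows = []
    for obj in bundle.get("objects", []):
        if obj.get("type") != "x-mitre-tactic":
            continue
        # The TA# lives in the external reference whose source is "mitre-attack".
        external_id = next(
            (
                ref["external_id"]
                for ref in obj.get("external_references", [])
                if ref.get("source_name") == "mitre-attack" and "external_id" in ref
            ),
            None,
        )
        if external_id is None:
            continue
        rows.append(
            {
                "external_id": external_id,
                "name": obj.get("name", ""),
                "shortname": obj.get("x_mitre_shortname", ""),
            }
        )
    return sorted(rows, key=lambda row: row["external_id"])


# Trimmed stand-in for the bundle excerpt above (most fields omitted).
SAMPLE_BUNDLE = json.loads("""
{
  "type": "bundle",
  "objects": [
    {
      "type": "x-mitre-tactic",
      "name": "Initial Access",
      "x_mitre_shortname": "initial-access",
      "external_references": [
        {"external_id": "TA0001", "source_name": "mitre-attack"}
      ]
    }
  ]
}
""")

print(tactics_from_bundle(SAMPLE_BUNDLE))
```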

davaya avatar Oct 26 '22 20:10 davaya

Wow, I took a quick look at the github repo and there is a lot to look at across the matrix. For validation, we could compromise and consider the range of techniques and tactics, requiring the 'T' and the 'TA' followed by integers, and the integers would need to be in the range of the version of MITRE in the Attack object. It's not perfect, as the techniques might not be part of the tactics (and the sub-techniques might be out of range, as those are more unwieldy), but it would catch some of your errors above. Given the name field would be optional, it is less of an issue from a validation standpoint (but still not great if sent by a producer).
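A minimal Python sketch of that compromise check follows. The function names, the four-digit pattern, and the `max_number` bounds are assumptions for illustration; the real upper bounds would have to come from whichever ATT&CK version the Attack object declares, and the sub-technique suffix is accepted without a range check.

```python
import re

# Assumed ID shapes: "TA" + 4 digits for tactics, "T" + 4 digits for techniques,
# with an optional ".NNN" sub-technique suffix.
TACTIC_RE = re.compile(r"TA(\d{4})")
TECHNIQUE_RE = re.compile(r"T(\d{4})(?:\.\d{3})?")


def is_plausible_tactic(uid: str, max_number: int) -> bool:
    """True if uid looks like TA#### and the number is within 1..max_number."""
    match = TACTIC_RE.fullmatch(uid)
    return match is not None and 1 <= int(match.group(1)) <= max_number


def is_plausible_technique(uid: str, max_number: int) -> bool:
    """True if uid looks like T#### (optionally .###) and is within 1..max_number."""
    match = TECHNIQUE_RE.fullmatch(uid)
    return match is not None and 1 <= int(match.group(1)) <= max_number


# Placeholder bounds, not taken from any ATT&CK release.
MAX_TACTIC = 100
MAX_TECHNIQUE = 2000

for candidate in ["TA0007", "Discovery", "TA999345", "Thursday"]:
    print(candidate, is_plausible_tactic(candidate, MAX_TACTIC))
print("T1069", is_plausible_technique("T1069", MAX_TECHNIQUE))
```

This catches malformed or out-of-range IDs but, as noted, it cannot tell whether a technique actually belongs to the listed tactics.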

pagbabian-splunk avatar Oct 27 '22 00:10 pagbabian-splunk

See PR https://github.com/ocsf/ocsf-schema/pull/326

rroupski avatar Oct 27 '22 17:10 rroupski