datadog-agent Add ability to identify interpreter of a script

What does this PR do?

Adds the ability to identify the interpreter of a script that is launched from inside another script, using its shebang line. This is currently limited to one layer of nesting. An example rule would be `exec.interpreter.file.name == ~"python*"`.
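To make the shebang idea concrete, here is a minimal Python sketch of reading the interpreter path from a script's first line and checking its file name against a glob such as `python*`. This is not the agent's implementation, which this PR does not show here. The helper names `parse_shebang` and `interpreter_name_matches` are hypothetical, and `fnmatch` is only assumed to approximate how the rule pattern behaves.

```python
import fnmatch
import os
from typing import Optional


def parse_shebang(first_line: bytes) -> Optional[str]:
    """Return the interpreter path named by a shebang line, or None."""
    if not first_line.startswith(b"#!"):
        return None
    # Drop the "#!" marker and surrounding whitespace, then keep only the
    # first token: anything after it is an argument to the interpreter.
    rest = first_line[2:].strip()
    if not rest:
        return None
    return rest.split()[0].decode("utf-8", errors="replace")


def interpreter_name_matches(first_line: bytes, pattern: str) -> bool:
    """Check the interpreter's base name against a glob pattern (hypothetical helper)."""
    interpreter = parse_shebang(first_line)
    if interpreter is None:
        return False
    return fnmatch.fnmatchcase(os.path.basename(interpreter), pattern)
```

Note that in this sketch a line such as `#! /usr/bin/env python3` yields `/usr/bin/env`, so a `python*` pattern would not match it. How the agent treats that case is not stated in this PR.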

Motivation

Currently, when a script like the one below is run, we can identify that a bash script was run, but not that a Python script was also run. This PR adds the ability to identify that Python was run.

```bash
#!/bin/bash

echo "Executing echo inside a bash script"

cat << EOF > pyscript.py
#!/usr/bin/python3

print('Executing print inside a python script')

EOF

echo "Back to bash"

chmod 755 pyscript.py
./pyscript.py
```

Additional Notes

The current implementation does not expose the interpreter of any script nested more than one layer deep. For example, in the following script, Perl cannot yet be identified at the rule level.

```bash
echo "Executing echo inside a bash script"

cat << '__HERE__' > hello.pl
#!/usr/bin/perl

my $foo = "Hello from Perl";
print "$foo\n";

__HERE__

chmod 755 hello.pl

cat << EOF > pyscript.py
#!/usr/bin/python3

import subprocess

print('Executing print inside a python script')

subprocess.run(["perl", "./hello.pl"])

EOF

echo "Back to bash"

chmod 755 pyscript.py
./pyscript.py
```

Possible Drawbacks / Trade-offs

Because I'm only sending the inode, mount id, and path id to userspace, the documented fields like

| Property | Type | Definition | Constants |
| --- | --- | --- | --- |
| `exec.interpreter.file.rights` | int | Mode/rights of the file | Chmod mode constants |
| `exec.interpreter.file.uid` | int | UID of the file's owner |  |
| `exec.interpreter.file.user` | string | User of the file's owner |  |

are not actually available. This can be confusing for users, so I'm considering sending the full file information, even though it increases the size of the data.
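The trade-off can be pictured with a small Python model. It is a sketch under stated assumptions, not the agent's actual event structure: the class names `InterpreterFileKey` and `InterpreterFileInfo` and the helper `unavailable_fields` are hypothetical. Only the three identifiers are populated, so the documented metadata fields stay empty.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class InterpreterFileKey:
    """The three identifiers the PR says are sent to userspace."""
    inode: int
    mount_id: int
    path_id: int


@dataclass
class InterpreterFileInfo:
    """Hypothetical userspace view of the interpreter file."""
    key: InterpreterFileKey
    # Documented fields that are not filled in when only the key is sent.
    rights: Optional[int] = None
    uid: Optional[int] = None
    user: Optional[str] = None


def unavailable_fields(info: InterpreterFileInfo) -> List[str]:
    """List the documented metadata fields that carry no value."""
    return [name for name in ("rights", "uid", "user") if getattr(info, name) is None]
```

Sending the full file information would correspond to filling in `rights`, `uid`, and `user` as well, at the cost of a larger payload per event.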

Describe how to test/QA your changes

See TestProcessIdentifyInterpreter for examples.

Reviewer's Checklist

[x] If known, an appropriate milestone has been selected; otherwise the Triage milestone is set.
[ ] Use the major_change label if your change has a major impact on the code base, impacts multiple teams, or changes important well-established internals of the Agent. This label will be used during QA to make sure each team pays extra attention to the changed behavior. For any customer-facing change, use a releasenote.
[x] A release note has been added or the changelog/no-changelog label has been applied.
[x] Changed code has automated tests for its functionality.
[x] Adequate QA/testing plan information is provided if the qa/skip-qa label is not applied.
[x] At least one team/.. label has been applied, indicating the team(s) that should QA this change.
[ ] If applicable, docs team has been notified or an issue has been opened on the documentation repo.
[ ] If applicable, the need-change/operator and need-change/helm labels have been applied.
[ ] If applicable, the k8s/<min-version> label, indicating the lowest Kubernetes version compatible with this feature, has been applied.
[ ] If applicable, the config template has been updated.

Jul 26 '22 20:07 modernplumbing

@modernplumbing I know it's a draft PR but do you mind updating the title of this PR to indicate what it's trying to do? Since it's a live PR, everyone watching this repo is being notified of its pushes which come in titled with the current PR title.

Aug 04 '22 18:08 sgnn7

> @modernplumbing I know it's a draft PR but do you mind updating the title of this PR to indicate what it's trying to do? Since it's a live PR, everyone watching this repo is being notified of its pushes which come in titled with the current PR title.

Sorry about that!

Aug 05 '22 02:08 modernplumbing
