dissect.target icon indicating copy to clipboard operation
dissect.target copied to clipboard

Auth Log format update on rfc5424

Open damasch opened this issue 2 years ago • 8 comments

  1. Preamble

    The basic format for syslog messages in the "/var/log/auth.log" file follows the standard syslog format.

    <Timestamp> <Hostname> <Service>: <Message>
    

    The following pattern applies specifically to the auth.log file.

    <Timestamp> <Hostname> <Service>[<process_id>]: <Message>
    

    See: https://datatracker.ietf.org/doc/html/rfc5424#section-6

    1. Test data

      Mar 29 10:43:01 ip-10-77-20-248 sshd[1193]: Accepted password for test_user from REDACTED IP port 52942 ssh2
      Mar 27 13:06:56 ip-10-77-20-248 sshd[1291]: Server listening on REDACTED IP port 22.
      Mar 27 13:06:56 ip-10-77-20-248 systemd-logind[1118]: Watching system buttons on /dev/input/event0 (Power Button)
      Mar 27 13:06:56 ip-10-77-20-248 systemd-logind[1118]: Watching system buttons on /dev/input/event1 (Sleep Button)
      Mar 27 13:06:56 ip-10-77-20-248 systemd-logind[1118]: New seat seat0.
      Mar 27 13:08:09 ip-10-77-20-248 sshd[1361]: Accepted publickey for test_user from REDACTED IP port 12345 ssh2: RSA SHA256:123456789asdfghjklöertzuio
      Mar 27 13:08:09 ip-10-77-20-248 sshd[1361]: pam_unix(sshd:session): session opened for user test_user by (uid=0)
      Mar 27 13:08:09 ip-10-77-20-248 systemd: pam_unix(systemd-user:session): session opened for user test_user by (uid=0)
      Mar 27 13:10:08 ip-10-77-20-248 sudo:   ubuntu : TTY=pts/0 ; PWD=/home/test_user ; USER=root ; COMMAND=/usr/bin/apt-key add -
      Mar 27 13:17:01 ip-10-77-20-248 CRON[2623]: pam_unix(cron:session): session closed for user root
      Mar 27 13:23:11 ip-10-77-20-248 sudo: pam_unix(sudo:session): session opened for user root by test_user(uid=0)
      
  2. Problem with the current config

    The config of the regex pattern is declared in:

    https://github.com/fox-it/dissect.target/blob/main/dissect/target/plugins/os/unix/log/auth.py#L18

    But if acquire or target runs. The jsonl export deliver a pattern/object with:

    • hostname

    • domain

    • ts

    • message

    • source

      The service and the process id is in this version part of the message.

      It would be better if the service (and process id) and the message will be seperated.

      Example for this line:

      Mar 29 10:43:01 ip-10-77-20-248 sshd[1193]: Accepted password for test_user from REDACTED IP port 52942 ssh2
      

      Declared is this standard in: https://datatracker.ietf.org/doc/html/rfc3164#section-5.4. Ses also: https://www.loggly.com/ultimate-guide/linux-logging-basics/

  3. Possible Solution

    Try this pattern to the auth.log file and you will get the service and the process seperated from the message.

    ( w{3} s+ d+ s+ d{2}: d{2}: d{2}) s+( S+) s+([0-9A-Za-z -])? [?( w)? ]?: s+(.+)

    devided by groups

    To access the data use:

    _AUTH_LOG_REGEX = r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(\S+)\s+([0-9A-Za-z\-]**)?\[?(\w**)?\]?:\s+(.+)"
    
    RE_ALRE = re.compile(_AUTH_LOG_REGEX)
    ...
        auth_log_object = re.search(RE_ALRE, line)
        if auth_log_object:
          print(auth_log_object[1])
          print(auth_log_object[2])
          print(auth_log_object[3])
          print(auth_log_object[4])
    
  4. Upgrade for the code

    Try this example and use named subgroups for the regex.

    
    import re
    
    auth_log_content = """
    Mar 29 10:43:01 my_unix_host sshd[1193]: Accepted password for test_user from REDACTED IP port 52942 ssh2
    Mar 27 13:06:56 my_unix_host sshd[1291]: Server listening on REDACTED IP port 22.
    Mar 27 13:06:56 my_unix_host systemd-logind[1118]: Watching system buttons on /dev/input/event0 (Power Button)
    Mar 27 13:06:56 my_unix_host systemd-logind[1118]: Watching system buttons on /dev/input/event1 (Sleep Button)
    Mar 27 13:06:56 my_unix_host systemd-logind[1118]: New seat seat0.
    Mar 27 13:08:09 my_unix_host sshd[1361]: Accepted publickey for test_user from REDACTED IP port 12345 ssh2: RSA SHA256:123456789asdfghjklöertzuio
    Mar 27 13:08:09 my_unix_host sshd[1361]: pam_unix(sshd:session): session opened for user test_user by (uid=0)
    Mar 27 13:08:09 my_unix_host systemd: pam_unix(systemd-user:session): session opened for user test_user by (uid=0)
    Mar 27 13:10:08 my_unix_host sudo:   ubuntu : TTY=pts/0 ; PWD=/home/test_user ; USER=root ; COMMAND=/usr/bin/apt-key add -
    Mar 27 13:17:01 my_unix_host CRON[2623]: pam_unix(cron:session): session closed for user root
    Mar 27 13:23:11 my_unix_host sudo: pam_unix(sudo:session): session opened for user root by test_user(uid=0)
    """
    
    _TS_REGEX = r"^[A-Za-z]{3}\s\*[0-9]{1,2}\s[0-9]{1,2}:[0-9]{2}:[0-9]{2}"
    RE_TS = re.compile(_TS_REGEX)
    RE_TS_AND_HOSTNAME = re.compile(_TS_REGEX + r"\s\S+\s")
    
    
    
    for line in auth_log_content.split("\n"):
      ts_and_hostname = re.search(RE_TS_AND_HOSTNAME, line)
      if ts_and_hostname:
        message = line.replace(ts_and_hostname.group(0), "").strip()
        print(ts_and_hostname.group(0), message)
    
    
    _AUTH_LOG_REGEX = r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(\S+)\s+([0-9A-Za-z\-]**)?\[?(\w**)?\]?:\s+(.+)"
    
    RE_ALRE = re.compile(_AUTH_LOG_REGEX)
    
    for line in auth_log_content.split("\n"):
    
      auth_log_object = re.search(RE_ALRE, line)
    
      if auth_log_object:
        print()
        print(line)
        print(auth_log_object[1])
        print(auth_log_object[2])
        print(auth_log_object[3])
        print(auth_log_object[4])
        print(auth_log_object[5])
    
    
    
    _AUTH_LOG_NAMED_REGEX = r"^(?P<timestamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<hostname>\S+)\s+(?P<service>[0-9A-Za-z\-]**)?\[?(?P<processid>\w**)?\]?:\s+(?P<message>.+)"
    
    RE_ALNRE = re.compile(_AUTH_LOG_NAMED_REGEX)
    
    for line in auth_log_content.split("\n"):
    
      auth_log_object = re.match(RE_ALNRE, line)
    
      if auth_log_object:
        details = auth_log_object.groupdict()
        print(details)
    

    Output

    {'timestamp': 'Mar 29 10:43:01', 'hostname': 'my_unix_host', 'service': 'sshd', 'processid': '1193', 'message': 'Accepted password for test_user from REDACTED IP port 52942 ssh2'}
    {'timestamp': 'Mar 27 13:06:56', 'hostname': 'my_unix_host', 'service': 'sshd', 'processid': '1291', 'message': 'Server listening on REDACTED IP port 22.'}
    {'timestamp': 'Mar 27 13:06:56', 'hostname': 'my_unix_host', 'service': 'systemd-logind', 'processid': '1118', 'message': 'Watching system buttons on /dev/input/event0 (Power Button)'}
    {'timestamp': 'Mar 27 13:06:56', 'hostname': 'my_unix_host', 'service': 'systemd-logind', 'processid': '1118', 'message': 'Watching system buttons on /dev/input/event1 (Sleep Button)'}
    {'timestamp': 'Mar 27 13:06:56', 'hostname': 'my_unix_host', 'service': 'systemd-logind', 'processid': '1118', 'message': 'New seat seat0.'}
    {'timestamp': 'Mar 27 13:08:09', 'hostname': 'my_unix_host', 'service': 'sshd', 'processid': '1361', 'message': 'Accepted publickey for test_user from REDACTED IP port 12345 ssh2: RSA SHA256:123456789asdfghjklöertzuio'}
    {'timestamp': 'Mar 27 13:08:09', 'hostname': 'my_unix_host', 'service': 'sshd', 'processid': '1361', 'message': 'pam_unix(sshd:session): session opened for user test_user by (uid=0)'}
    {'timestamp': 'Mar 27 13:08:09', 'hostname': 'my_unix_host', 'service': 'systemd', 'processid': '', 'message': 'pam_unix(systemd-user:session): session opened for user test_user by (uid=0)'}
    {'timestamp': 'Mar 27 13:10:08', 'hostname': 'my_unix_host', 'service': 'sudo', 'processid': '', 'message': 'ubuntu : TTY=pts/0 ; PWD=/home/test_user ; USER=root ; COMMAND=/usr/bin/apt-key add -'}
    {'timestamp': 'Mar 27 13:17:01', 'hostname': 'my_unix_host', 'service': 'CRON', 'processid': '2623', 'message': 'pam_unix(cron:session): session closed for user root'}
    {'timestamp': 'Mar 27 13:23:11', 'hostname': 'my_unix_host', 'service': 'sudo', 'processid': '', 'message': 'pam_unix(sudo:session): session opened for user root by test_user(uid=0)'}
    

    So I prefer the named subgroups. But this is maybe not necessary for the mapping in the yield part.

damasch avatar Jun 21 '23 14:06 damasch

Thank you for the extensive description and suggestion! Sounds sensible to me.

@JSCU-CNI since you initially wrote the auth.log parser, perhaps you can comment on if you agree?

Schamper avatar Jun 22 '23 11:06 Schamper

This seems like a good improvement of the auth.log parser indeed :+1:

JSCU-CNI avatar Jul 05 '23 12:07 JSCU-CNI

Great! @damasch would you like to create a PR for this yourself?

Schamper avatar Jul 25 '23 11:07 Schamper

@Schamper Yes but how? I'm not a contributer to create a branch? Or should I fork this project?

I write a patch for this plugin. I'ts a suggestion:

import re
from itertools import chain
from typing import Iterator

from dissect.target.helpers.record import TargetRecordDescriptor
from dissect.target.helpers.utils import year_rollover_helper
from dissect.target.plugin import Plugin, export

AuthLogRecord = TargetRecordDescriptor(
    "linux/log/auth",
    [
        ("datetime", "ts"),
        ("string", "service"),
        ("string", "processid"),
        ("string", "message"),
        ("string", "user"),
        ("string", "remoteip"),
        ("string", "port"),
        ("string", "authservice"),
        ("string", "protocol"),
        ("string", "encryption"),
        ("string", "method"),
        ("string", "key"),
        ("string", "misc"),
        ("string", "sessionmode"),
        ("string", "userid"),
        ("string", "useridassociate"),
        ("string", "tty"),
        ("string", "pwd"),
        ("string", "usereffective"),
        ("string", "command"),
        ("string", "cron"),
        ("string", "conectionmode"),
        ("path", "source"),
    ],
)

# Timestamp Regex
_TS_REGEX = r"^[A-Za-z]{3}\s*[0-9]{1,2}\s[0-9]{1,2}:[0-9]{2}:[0-9]{2}"
RE_TS = re.compile(_TS_REGEX)
RE_TS_AND_HOSTNAME = re.compile(_TS_REGEX + r"\s\S+\s")


# New regex pattern for getting the log entries
# Mar 29 10:43:01 my_unix_host ...
RE_ENTRY = re.compile(r'^\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}\s+\S+\s+(?P<service>[0-9A-Za-z\-\-\(\)=]*)?\[?(?P<processid>\w*)?\]?:\s+(?P<message>.+)')

# New regex pattern for interpreting the SSH message "Accepted password"
# Mar 29 10:43:01 my_unix_host sshd[1193]: Accepted password for test_user from 127.0.0.1 port 52942 ssh2
RE_SSH_ACCEPTED_PASSWORD = re.compile(r'^Accepted\spassword\sfor\s(?P<user>[\S\_]+)\sfrom\s(?P<remoteip>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}|[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4})\sport\s(?P<port>[0-9]{1,5})\s(?P<authservice>\w+)$')

# New regex pattern for interpreting the SSH message "Accepted publickey"
# Accepted publickey for test_user from 123.123.123.123 port 12345 ssh2: RSA SHA256:123456789asdfghjklöertzuio
RE_SSH_ACCEPTED_PUBLICKEY = re.compile(r'^Accepted\spublickey\sfor\s(?P<user>\S+)\sfrom\s(?P<remoteip>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}|[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4})\sport\s(?P<port>[0-9]{1,5})\s(?P<protocol>\w+):\s(?P<encryption>[\w\-]+)\s(?P<method>[\w]+):(?P<key>[\w:]+)(?P<misc>.*)$')

# New regex pattern for interpreting the PAM UNIX message "ssh"
# Jul  5 13:20:15 test-VirtualBox sudo: pam_unix(sudo:session): session opened for user root(uid=0) by test(uid=0)
RE_SSH_PAM_UNIX = re.compile(r'pam_unix\((?P<authservice>sshd):.*\): +session +(?P<sessionmode>closed|opened)\sfor\suser\s(?P<user>\w+)(?:\(uid=(?P<useridassociate>\w+)\))?(?:\sby\s)?(?:\(uid=(?P<userid>\w+)\))?$')

# New regex pattern for interpreting the connection mode on close message "ssh"
# Mar 29 17:07:19 my_unix_host sshd[4649]: Connection closed by 85.245.107.41 port 54790 [preauth]
RE_SSH_CONNECTION = re.compile(r'^Connection\s(?P<conectionmode>closed)\sby\s(?P<remoteip>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}|[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4}:[0-9a-zA-Z]{0,4})\sport\s(?P<port>[0-9]{1,5})$(?P<misc>.*)$')

# New regex pattern for interpreting the command from message "sudo"
# Jul  4 17:10:18 my_unix_host sudo:     test : TTY=pts/0 ; PWD=/home/test ; USER=root ; COMMAND=/usr/bin/apt update
RE_SUDO_COMMAND = re.compile(r'^(?P<user>\w+)\s:\sTTY=(?P<tty>\w+\/\w+)\s;\sPWD=(?P<pwd>[\/\w]+)\s;\sUSER=(?P<usereffective>\w+)\s;\sCOMMAND=(?P<command>.+)$')

# New regex pattern for interpreting the session message "cron"
# Mar 29 17:07:25 my_unix_host sshd[4651]: pam_unix(sshd:session): session opened for user ubuntu by (uid=0)
RE_CRON_PAM_UNIX = re.compile(r'^pam_unix\(cron:(?P<cron>.*)\): +session +(?P<sessionmode>closed|opened)\sfor\suser\s(?P<user>\w+)(?:\(uid=(?P<useridassociate>\w+)\))?(?:\sby\s)?(?:\(uid=(?P<userid>\w+)\))?$')

class AuthPlugin(Plugin):
    def check_compatible(self) -> bool:
        var_log = self.target.fs.path("/var/log")
        return any(var_log.glob("auth.log*")) or any(var_log.glob("secure*"))

    def apply_regex_on_message(self, pattern, messsage):
        """Return a data object with the data from the group dict regex"""
        data = {}
        
        message_data = re.match(pattern, messsage)
        if message_data:
            data.update(message_data.groupdict())

        return data

    @export(record=[AuthLogRecord])
    def securelog(self) -> Iterator[AuthLogRecord]:
        """Return contents of /var/log/auth.log* and /var/log/secure*."""
        return self.authlog()

    @export(record=[AuthLogRecord])
    def authlog(self) -> Iterator[AuthLogRecord]:
        """Return contents of /var/log/auth.log* and /var/log/secure*."""

        # Assuming no custom date_format template is set in syslog-ng or systemd (M d H:M:S)
        # CentOS format: Jan 12 13:37:00 hostname daemon: message
        # Debian format: Jan 12 13:37:00 hostname daemon[pid]: pam_unix(daemon:session): message

        tzinfo = self.target.datetime.tzinfo

        var_log = self.target.fs.path("/var/log")
        for auth_file in chain(var_log.glob("auth.log*"), var_log.glob("secure*")):
            for ts, line in year_rollover_helper(auth_file, RE_TS, "%b %d %H:%M:%S", tzinfo):
                data = {
                    'ts': ts,
                }

                log_entry = re.match(RE_ENTRY, line)
                if not log_entry:
                    self.target.log.warning("Log entry does not match with pattern %s.", auth_file)
                    self.target.log.warning("Skipping this line: %s", line)
                    continue
                
                data.update(log_entry.groupdict())

                data.update(self.apply_regex_on_message(RE_SSH_ACCEPTED_PASSWORD, log_entry["message"]))
                data.update(self.apply_regex_on_message(RE_SSH_ACCEPTED_PUBLICKEY, log_entry["message"]))
                data.update(self.apply_regex_on_message(RE_SSH_PAM_UNIX, log_entry["message"]))
                data.update(self.apply_regex_on_message(RE_SSH_CONNECTION, log_entry["message"]))
                data.update(self.apply_regex_on_message(RE_SUDO_COMMAND, log_entry["message"]))
                data.update(self.apply_regex_on_message(RE_CRON_PAM_UNIX, log_entry["message"]))

                data.update({
                    'source': auth_file,
                    '_target': self.target
                })
                yield AuthLogRecord(
                    **data
                    )


damasch avatar Jul 28 '23 09:07 damasch

You can fork the project, commit/push to your own fork and create a PR to this project. Doing it that way also makes sure you get properly credited for your contribution.

If you would like us to make these changes instead, just say the word :smile:

Schamper avatar Jul 28 '23 11:07 Schamper

@Schamper is your team (planning on) working on this?

JSCU-CNI avatar Sep 05 '24 10:09 JSCU-CNI

Hi @JSCU-CNI yes we do. As of now, it is planned to go in Dissect release 3.17.

EinatFox avatar Sep 05 '24 15:09 EinatFox

Thanks for the update :)

JSCU-CNI avatar Sep 05 '24 15:09 JSCU-CNI