
Create Windows.Memory.Mem2Disk artifact

Open lautarolecumberry opened this issue 1 month ago • 25 comments

This artifact compares executables in memory (RAM) with their copies on hard disk in order to detect code injected into RAM. Such injection rarely happens legitimately and is mostly used by malware. The check runs live on the target system(s) without dumping the memory.

lautarolecumberry avatar Oct 04 '25 21:10 lautarolecumberry

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Oct 04 '25 21:10 CLAassistant

Does this take relocations into account? On a quick read it does not, so it is unlikely to work.

scudette avatar Oct 04 '25 21:10 scudette

Thanks for your feedback!

Which relocations are you referring to? ASLR?

The main RAM relocations we could find were due to 'BaseOfData'. The IgnoreOneByteOffsets parameter takes care of them. In our tests (on standard Windows systems) we did not encounter other relocations that affected the technique.

As for whether it works: Lautaro wrote a master's thesis on it and tested 54 samples (38 malicious / 16 benign). After the thesis we improved the false positive rate due to 'BaseOfData' and retested against 34 malware samples including three C2 frameworks (sliver, mythic, havoc) and 18 normal programs. All three C2 injections were detected.

Here are the results of the retests:

              Not-detected   Detected    Total
Non-malware   33% (17)       2% (1)      35% (18)
Malware       19% (10)       46% (24)    65% (34)
Total         52% (27)       48% (25)    100% (52)

Detection rate is 96.0%. Sensitivity is 70.6%. Accuracy is 78.8%.
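For anyone who wants to re-derive these figures from the table, here is a minimal sketch. It assumes that "detection rate" means the share of detections that were really malware (24 of 25, often called precision); the thread does not spell out that definition, so treat it as an interpretation.

```python
# Confusion-matrix counts taken from the retest table above.
true_pos = 24   # malware, detected
false_neg = 10  # malware, not detected
false_pos = 1   # non-malware, detected
true_neg = 17   # non-malware, not detected

total = true_pos + false_neg + false_pos + true_neg

# Assumed meaning of "detection rate": detections that were really malware.
precision = true_pos / (true_pos + false_pos)
# Share of malware samples that were detected.
sensitivity = true_pos / (true_pos + false_neg)
# Share of all samples that were classified correctly.
accuracy = (true_pos + true_neg) / total

print(f"detection rate={precision:.1%} "
      f"sensitivity={sensitivity:.1%} accuracy={accuracy:.1%}")
```

Under that reading the three values come out as 96.0%, 70.6% and 78.8%, matching the numbers quoted above.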

Thesis and original code is here: https://github.com/lautarolecumberry/DetectingFilelessMalware

We're currently working on a blog post about the improvements. The submitted code already includes the 'BaseOfData' improvements (the thesis does not).

If you specify which relocations you are referring to, we're happy to have a look and improve the technique further :blush:

mdenzel avatar Oct 04 '25 22:10 mdenzel

I was thinking of the relocations needed when the binary is not built with position independent code (pic)

https://0xrick.github.io/win-internals/pe7/

In that case the binary image in memory will be different from the image on disk due to addresses being relocated by the loader.

Maybe non-pic binaries are not that common any more. Still, the artifact should perhaps flag this case so that the analyst knows to ignore the results.

scudette avatar Oct 04 '25 23:10 scudette

We could probably also make the PowerShell for CheckOneByteChanges native VQL and do the comparisons in memory, so that .mem and .disk are not written as tempfiles. Maybe that's a v2 though :)

mgreen27 avatar Oct 04 '25 23:10 mgreen27

Dear @scudette & @mgreen27

Thank you again for the valuable suggestions.

I checked online and in our notes. Documenting the findings here: (position independent code = pic; position dependent code = pdc)

  • ASLR usually does not change the .text segment, so our technique is usually not affected by it
  • modern compilers/linkers by default use pic (that is probably why it always worked with us on default windows)
  • from Windows 8 on, ASLR is the standard and requires pdc to have relocation tables

So we could add another column to show if there are relocation tables/pdc.

As for why we went with powershell: I tried VQL first but I could not figure out how to iterate over a binary byte by byte. A diff or comparison is possible, but we also need to ignore changes when it is just one byte. Any ideas how to implement this in VQL?
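To make the requirement concrete, here is a language-neutral sketch in Python (not VQL) of one possible reading of "ignore changes when it is just one byte": a difference only counts if two or more consecutive bytes differ. The function name and the exact rule are assumptions for illustration, not the logic of the submitted PowerShell.

```python
def has_multibyte_change(mem: bytes, disk: bytes) -> bool:
    """Return True if any run of consecutive differing bytes is longer than one."""
    run = 0
    for a, b in zip(mem, disk):
        # Extend the current run of differing bytes, or reset it on a match.
        run = run + 1 if a != b else 0
        if run > 1:
            return True
    # Assumption: a length mismatch is treated as a real difference.
    return len(mem) != len(disk)


print(has_multibyte_change(b"\x01\x02\x03", b"\x01\xff\x03"))  # isolated byte
print(has_multibyte_change(b"\x01\x02\x03", b"\x01\xff\xff"))  # two-byte run
```

The early return means identical or nearly identical buffers are scanned once, while a real injection is reported at the first multi-byte run.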

mdenzel avatar Oct 05 '25 07:10 mdenzel

Lautaro and I checked pic vs pdc:

  1. creating a pdc binary

gcc has an option -fno-pic and -fno-pie but both produce the same executable on Windows with mingw. PE-bear shows that only timestamp and checksum are different. Same for -flinker-output=exec.

So, we are not sure how to even create a binary with pdc for testing. Might it be that modern compilers do not even have the option to create pdc any more?

  2. detecting a pdc binary

Relocation tables exist also for pic binaries. There is a value e_crlc that showed the relocations but it seems to be deprecated (source: https://docs.rs/goblin/latest/goblin/pe/header/struct.DosHeader.html#structfield.relocations).

Also, according to https://stackoverflow.com/questions/73221196/is-there-a-way-to-tell-if-a-windows-binary-is-a-pie people think there is no flag to tell if a windows binary is pic.

mdenzel avatar Oct 05 '25 22:10 mdenzel

Thanks for starting this discussion - I think this will end up being a very cool artifact.

I did look at it today and played with the VQL to make it faster and more efficient. I also wanted to see how many false positives there were.

This is my improved version

name: Windows.Memory.Mem2Disk
author: Lautaro Lecumberry, Dr. Michael Denzel
description: |
    This artifact compares executables in memory (RAM) with those
    on hard disk. This way, RAM injections are detected. This rarely
    happens legitimately and is mostly used by malware.
    This check is executed without dumping the memory and works live
    on the target system(s).

parameters:
- name: IgnoreOneByteOffsets
  description: Relative Virtual Addresses (RVA) cause an offset in the code in memory of a process.
               This is the case when the field BaseOfData is set to 0x8000. It creates false
               positives and is fairly safe to ignore (1-byte injections are really hard).
  default: True
  type: bool
- name: UploadFindings
  description: Upload all executables where code in memory does not match code on disk. This
               can potentially generate a lot of traffic. Dry-run before enabling this option.
  default: False
  type: bool
- name: ProcessNameFilter
  type: regex
  default: notepad

precondition: SELECT OS From info() where OS = 'windows'

export: |
  // These functions help to resolve the Kernel Device Filenames
  // into a regular filename with drive letter.
  LET DriveReplaceLookup <= SELECT
      split(sep_string="\\", string=Name)[-1] AS Drive,
      upcase(string=SymlinkTarget) AS Target,
      len(list=SymlinkTarget) AS Len
    FROM winobj()
    WHERE Name =~ "^\\\\GLOBAL\\?\\?\\\\.:"
  
  LET _DriveReplace(Path) = SELECT Drive + Path[Len:] AS ResolvedPath
    FROM DriveReplaceLookup
    WHERE upcase(string=Path[:Len]) = Target
  
  LET DriveReplace(Path) = _DriveReplace(Path=Path)[0].ResolvedPath ||
      Path

sources:
- query: |
    -- get all processes
    LET GetPids = SELECT Pid,
                         Name,
                         Username
      FROM pslist()
      WHERE Name =~ ProcessNameFilter
    
    -- get all memory pages for a certain pid
    LET InfoFromVad(Pid) = SELECT Address,
                                  Size,
                                  DriveReplace(Path=MappingName) AS Path
      FROM vad(pid=Pid)
      WHERE MappingName
       AND Protection =~ "xr-"
       AND MappingName =~ "(exe)$"
      LIMIT 1
    
    LET GetTextSegment(Path) = filter(condition="x=>x.Name = '.text'",
                                      list=parse_pe(file=Path).Sections)[0]
    
    -- parse the executable (PE) from memory (specifically, the text segment)
    LET GetMetadata(Pid, Name) = SELECT
        Path,
        str(str=Pid) AS PidFilename,
        Address,
        GetTextSegment(Path=Path) AS TextSegmentData
      FROM InfoFromVad(Pid=Pid)
      WHERE Address != 0
       AND TextSegmentData.FileOffset
    
    LET Hex(X) = format(format="%#x", args=X)
    
    -- read the executable from memory and hard disk
    LET GetContent(Pid, Name) = SELECT *, Address AS MemAddress,
                                       read_file(
                                         accessor="process",
                                         offset=Address,
                                         filename=PidFilename,
                                         length=TextSegmentData.Size) AS MemoryData,
                                       hash(
                                         path=PidFilename,
                                         accessor="process",
                                         hashselect="SHA256").SHA256 AS MemorySHA256,
                                       TextSegmentData.FileOffset AS DiskAddress,
                                       TextSegmentData.Size AS SegmentSize,
                                       read_file(
                                         accessor="file",
                                         offset=TextSegmentData.FileOffset,
                                         filename=Path,
                                         length=TextSegmentData.Size) AS DiskData,
                                       hash(
                                         path=Path,
                                         accessor="file",
                                         hashselect="SHA256").SHA256 AS DiskSHA256
      FROM GetMetadata(
        Name=Name,
        Pid=Pid)
      WHERE MemoryData
       AND log(
             dedup=-1,
             message="Inspecting Pid %v (%v): %#x-%#x vs %#x-%#x",
             args=[Pid, Name, Address, Address + SegmentSize,
               DiskAddress, DiskAddress + SegmentSize])
    
    -- Filter out not needed comparisons early
    LET FilterContent(Pid, Name) = SELECT *, MemoryData = DiskData AS Comparison
      FROM GetContent(Pid=Pid, Name=Name)
      WHERE NOT Comparison
    
    -- Dict stored as query, so it only gets executed once
    LET Tmp <= dict(a=0)
    
    LET Cmp(X, Y) = SELECT X[_value] = Y[_value]  AND X[1] = Y[1] AS Eq
      FROM range(end=len(list=X), step=2)
      WHERE set(item=Tmp,
                field="a",
                value=if(condition=Eq  AND Tmp.a < 2, then=0, else=Tmp.a + 1))
       AND Tmp.a > 2
      LIMIT 1
    
    LET CheckOneByteChanges(X, Y) = (X = Y
         AND log(message="Comparing %v quickly", dedup=-1, args=len(list=X))) OR (
          set(item=Tmp, field="a", value=0)
         && Cmp(X=X, Y=Y))
    
    -- compare the executable from memory and hard disk
    -- only print the ones where they do not match
    LET Compare(Pid, Name) = if(
        condition=log(message="Comparing process %v", args=Pid)
         AND IgnoreOneByteOffsets,
        then={
        SELECT Pid,
               PidFilename,
               Path,
               NOT CheckOneByteChanges(X=MemoryData, Y=DiskData) AS OneByteOffset,
               Comparison,
               MemorySHA256,
               DiskSHA256,
               MemAddress,
               DiskAddress,
               SegmentSize
        FROM FilterContent(Pid=Pid, Name=Name)
        WHERE NOT OneByteOffset
      },
        else={
        SELECT Pid,
               PidFilename,
               Path,
               Comparison,
               MemorySHA256,
               DiskSHA256,
               MemAddress,
               DiskAddress,
               SegmentSize
        FROM FilterContent(Pid=Pid, Name=Name)
      })
    
    -- compare with uploading the suspicious executables
    LET CompareAndUpload(Pid, Name) = SELECT
        Pid,
        Path,
        Hex(X=MemAddress) AS MemAddress,
        Hex(X=DiskAddress) AS DiskAddress,
        Hex(X=SegmentSize) AS SegmentSize,
        upload(
          file=pathspec(DelegateAccessor="process",
                        DelegatePath=PidFilename,
                        Path=[dict(Offset=MemAddress, Length=SegmentSize), ]),
          name=pathspec(parse=format(format="%s.%d.mem", args=[Path, Pid]),
                        path_type="windows"),
          accessor="sparse") AS UploadMem,
        upload(
          file=pathspec(DelegateAccessor="file",
                        DelegatePath=Path,
                        Path=[dict(Offset=DiskAddress, Length=SegmentSize), ]),
          name=pathspec(parse=format(format="%s.%d.disk", args=[Path, Pid]),
                        path_type="windows"),
          accessor="sparse") AS UploadDisk
      FROM Compare(Pid=Pid, Name=Name)
    
    -- for every process, evaluate the memory-harddisk-comparison
    SELECT *
    FROM foreach(row=GetPids,
                 workers=20,
                 query={
        SELECT *
        FROM if(condition=UploadFindings,
                then={
        SELECT *
        FROM CompareAndUpload(Pid=Pid, Name=Name)
      },
                else={
        SELECT *
        FROM Compare(Pid=Pid, Name=Name)
      })
      })

There were a couple of smaller issues:

  1. The first issue is that the MappingName values returned by the VAD plugin are in kernel notation - they need to be converted to a regular path before we can open the file (for example \Device\HarddiskVolume3\velociraptor.exe should be C:/velociraptor.exe).

I added the code to convert back to regular paths by inspecting the object directory in the kernel object manager.

  2. I also added more logging so we can see exactly what it is trying to do.
  3. Additionally I optimized the code to just check the two strings for equality - most of the time they will be equal so there is no need to fall back to byte by byte comparisons.
  4. I also added proper sparse upload of the regions if they were different.
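The device path conversion from the first point can be sketched outside VQL as follows. This is an illustration only: the lookup table is hard-coded here (in the artifact it is built from the object manager via winobj()), and the function name drive_replace and the boundary check are assumptions, not part of the artifact.

```python
def drive_replace(path: str, device_to_drive: dict[str, str]) -> str:
    """Replace a kernel device prefix with its drive letter, if one is known."""
    upper = path.upper()
    for device, drive in device_to_drive.items():
        if upper.startswith(device.upper()):
            rest = path[len(device):]
            # Only accept a match on a full path component, so that
            # HarddiskVolume3 does not also match HarddiskVolume30.
            if rest == "" or rest.startswith("\\"):
                return drive + rest
    # Unknown device: return the path unchanged.
    return path


# Hypothetical mapping for illustration.
lookup = {r"\Device\HarddiskVolume3": "C:"}
print(drive_replace(r"\Device\HarddiskVolume3\velociraptor.exe", lookup))
```

Falling back to the unchanged path mirrors the `|| Path` fallback in the DriveReplace helper above.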

After playing with the artifact I found some false positives on a clean system. In particular velociraptor.exe was a FP - I uploaded both the mem and disk versions and they were almost identical except for a qword at offset 0x1D41166 (marked with ->)

01D41150   E9 DB F3 2B  FE 90 90 90  90 90 90 90  90 90 90 90  FF FF FF FF  FF FF FF FF->80 1E 7F C3  F7 7F 00 00  50 21 4B C5  F7 7F 00 00  ...+............................P!K.....
01D41178   00 00 00 00  00 00 00 00  FF FF FF FF  FF FF FF FF  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ........................................
01D411A0   00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ........................................
01D411C8   00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ........................................
01D411F0   00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00                                                                                ................
---  velociraptor.exe.7244.mem       --0x1D411F1/0x1D41200--100%---------------------------------------------------------------------------------------------------------------------
01D41150   E9 DB F3 2B  FE 90 90 90  90 90 90 90  90 90 90 90  FF FF FF FF  FF FF FF FF ->80 1E 08 40  01 00 00 00  50 21 D4 41  01 00 00 00  ...+.......................@....P!.A....
01D41178   00 00 00 00  00 00 00 00  FF FF FF FF  FF FF FF FF  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ........................................
01D411A0   00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ........................................
01D411C8   00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ........................................
01D411F0   00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00                                                                                ................
---  velociraptor.exe.7244.disk       --0x1D411D8/0x1D41200--100%---------

I then inspected the base address of the image in memory and the PE file:

SELECT format(format="%#x", args=Address) AS AddressHex,
       *
FROM vad(pid=getpid())
WHERE MappingName =~ ".exe$"

SELECT
    format(format="%#x",
           args=parse_pe(file="C:/velociraptor.exe").Sections[0].VMA)
FROM scope()

You can see that the VMA (Virtual Memory Address) in the PE header is 0x140001000 and the actual address in memory is 0x7ff7c3771000. Compare the bytes that have changed between the two images:

memory: 0x7FF7C37F1E80, disk: 0x000140081E80

and 0x7FF7C37F1E80 - 0x7ff7c3771000 = 0x000140081E80 - 0x140001000 = 0x80E80

So you can see how the address was fixed up from the disk image to the memory image by the loader - this is what I meant by relocations - the loader adjusts the addresses by the ASLR offset.

In my testing this is not very common at least in this binary but there were two binaries that did have relocations. To properly account for this we need to calculate the relative offset that should have been added (similar to the calculation above) and see if it all adds up.
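The calculation can be written down as a small check. This is a sketch using the values from this comment; the helper name is_relocation is hypothetical.

```python
mem_base = 0x7FF7C3771000    # actual address of the section in memory (from vad())
disk_vma = 0x140001000       # VMA of the same section in the PE header
mem_value = 0x7FF7C37F1E80   # qword read from process memory
disk_value = 0x000140081E80  # qword read from the file on disk

# The offset the loader is expected to add to every relocated address.
aslr_delta = mem_base - disk_vma


def is_relocation(mem_value: int, disk_value: int, aslr_delta: int) -> bool:
    """A difference is an expected relocation if it equals the ASLR offset."""
    return mem_value - disk_value == aslr_delta


print(hex(aslr_delta), is_relocation(mem_value, disk_value, aslr_delta))
```

Any differing value whose delta is not equal to aslr_delta would then be the interesting one to report.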

scudette avatar Oct 06 '25 15:10 scudette

Wow, thanks for all the work!

Hm, I am thinking how to solve this issue without recalculating the entire relocation tables - is checking only the ASLR offset enough? Let's talk offline. I sent you a message on Discord.

mdenzel avatar Oct 06 '25 20:10 mdenzel

@scudette maybe we should add in the device path conversion exports into the VAD artifact (or another main project artifact) so we can import them easily?

mgreen27 avatar Oct 07 '25 22:10 mgreen27

Yeah I have a PR with that already - will send soon.

scudette avatar Oct 07 '25 23:10 scudette

Hi, we updated our query to check for the offsets and add them to a dictionary (as Mike suggested) but it's very slow and it times out. Can you have a look into it?

The problem should be in the following part of the code:

    LET OffsetsTmp <= dict()

    LET CheckOffsetsHelper(X, Y) = SELECT 
        atoi(string=format(format="0x%x", args=substr(str=X, start=_value, end=AddressLength+_value)))
          - atoi(string=format(format="0x%x", args=substr(str=Y, start=_value, end=AddressLength+_value))) AS Difference
      FROM range(end=len(list=X), step=AddressLength)
      WHERE if(condition=Difference!=0,
        then=set(
          item=OffsetsTmp, 
          field=Difference, 
          value=if(
            condition=OffsetsTmp[format(format="%d", args=Difference)], 
            then=1+OffsetsTmp[format(format="%d", args=Difference)], 
            else=1
          )
        )
      )

PS: Artifact code is not yet finished, we have one TODO remaining. Please, don't merge the PR yet

lautarolecumberry avatar Oct 13 '25 22:10 lautarolecumberry

I played with this a bit more today.

Because VQL is an interpreted language it is much faster to compare strings than to iterate over each int one at a time. As you already found out, most of the time relocations are not used, so we can save a lot of time by comparing large buffers and eliminating those which are identical. Also, memory is always allocated in blocks of page size and the relocations (on 64 bit systems) must be aligned to 8 bytes.

The following code quickly finds all 8 byte ints which are different between two regions. The trick is to first read in 1 MB blocks and only keep those 1 MB blocks which are different. Then each of those is broken up into 4096 byte blocks, again keeping only those which are different. Finally each 8 byte value within the smaller blocks is checked.

-- Format the items as a hex string
LET Hex(X) = format(format="%x", args=[X,])

LET _CompareRegions(Base, X, Y, PAGESIZE) = SELECT
    _value + Base AS Offset,
    X[_value:(_value + PAGESIZE)] AS XInt,
    Y[_value:(_value + PAGESIZE)] AS YInt
  FROM range(end=len(list=X), step=PAGESIZE)
  WHERE XInt != YInt

LET CompareRegions(X, Y) = SELECT Offset,
                                  Hex(X=XInt) AS X,
                                  Hex(X=YInt) AS Y
  FROM foreach(row={
    SELECT *
    FROM _CompareRegions(Base=0, X=X, Y=Y, PAGESIZE=1024 * 1024)
  },
               query={
    SELECT *
    FROM foreach(row={
    SELECT *
    FROM _CompareRegions(Base=Offset, X=XInt, Y=YInt, PAGESIZE=4096)
  },
                 query={
    SELECT *
    FROM _CompareRegions(Base=Offset, X=XInt, Y=YInt, PAGESIZE=8)
  })
  })

Applying this code to the Velociraptor binary itself gives two relocations and runs in under a second

[image: query output listing the two relocations]

You can see the first few bytes (low order bytes) are the same but the end bytes are different. If we converted them to ints and subtracted them, the difference would be a constant.
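The same narrowing strategy, plus the integer subtraction, can be sketched in Python on synthetic data. This is only an illustration of the idea, not a replacement for the VQL; the function names and the little-endian interpretation of the words are assumptions.

```python
def differing_blocks(base, x, y, size):
    """Yield (offset, x_block, y_block) for each size-byte block that differs."""
    for off in range(0, len(x), size):
        bx, by = x[off:off + size], y[off:off + size]
        if bx != by:
            yield base + off, bx, by


def differing_words(mem, disk, word=8):
    """Narrow down from 1 MB blocks to 4096-byte pages to single words."""
    for o1, x1, y1 in differing_blocks(0, mem, disk, 1024 * 1024):
        for o2, x2, y2 in differing_blocks(o1, x1, y1, 4096):
            yield from differing_blocks(o2, x2, y2, word)


def word_delta(mem_word, disk_word):
    """Interpret both words as little-endian integers and subtract them."""
    return int.from_bytes(mem_word, "little") - int.from_bytes(disk_word, "little")


# Synthetic 2 MB buffers that differ in one 8-byte pointer (values from the thread).
disk = bytearray(2 * 1024 * 1024)
mem = bytearray(disk)
disk[0x100008:0x100010] = (0x140081E80).to_bytes(8, "little")
mem[0x100008:0x100010] = (0x7FF7C37F1E80).to_bytes(8, "little")

hits = list(differing_words(bytes(mem), bytes(disk)))
for offset, m, d in hits:
    print(hex(offset), hex(word_delta(m, d)))
```

Only the one 1 MB block and the one page that contain the changed pointer are ever split further, which is where the speed-up comes from.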

scudette avatar Oct 16 '25 04:10 scudette

Playing with it some more I found some edge cases

  • For 32 bit processes we need to compare 4 bytes instead of 8 bytes.
  • For some processes we cannot read the memory at all. I get that with LsaIso.exe, I think because it is isolated.

I can calculate the ASLR shift and mark all other differences which are not equal to the ASLR shift (which is expected for relocations).

Also I expected trampoline hooks to be inserted into ntdll.dll and ntkern.dll not really into the executable itself so I am widening the search to include those DLLs as well as the executable.

This works pretty well when accounting for those edge cases. It takes about 10 seconds to run on my system.

This is the current notebook query I am using

LET ProcessNameFilter <= "."

LET ModuleRegEx <= "(KERNELBASE|ntdll).dll|.exe"

LET UploadFindings <= FALSE

LET IgnoreOneByteOffsets <= TRUE

// These functions help to resolve the Kernel Device Filenames
// into a regular filename with drive letter.
LET DriveReplaceLookup <= SELECT
    split(sep_string="\\", string=Name)[-1] AS Drive,
    upcase(string=SymlinkTarget) AS Target,
    len(list=SymlinkTarget) AS Len
  FROM winobj()
  WHERE Name =~ "^\\\\GLOBAL\\?\\?\\\\.:"

LET _DriveReplace(Path) = SELECT Drive + Path[Len:] AS ResolvedPath
  FROM DriveReplaceLookup
  WHERE upcase(string=Path[:Len]) = Target

LET DriveReplace(Path) = _DriveReplace(Path=Path)[0].ResolvedPath ||
    Path

-- get all processes
LET GetPids = SELECT Pid,
                     Name,
                     Username,
                     if(condition=IsWow64, then=4, else=8) AS IntSize
  FROM pslist()
  WHERE Name =~ ProcessNameFilter

-- get all memory pages for a certain pid
LET InfoFromVad(Pid) = SELECT Address,
                              Size,
                              DriveReplace(Path=MappingName) AS Path
  FROM vad(pid=Pid)
  WHERE MappingName
   AND Protection =~ "xr-"
   AND MappingName =~ ModuleRegEx
  LIMIT 1

LET GetTextSegment(Path) = filter(condition="x=>x.Name = '.text'",
                                  list=parse_pe(file=Path).Sections)[0]

-- parse the executable (PE) from memory (specifically, the text segment)
LET GetMetadata(Pid) = SELECT Path,
                              str(str=Pid) AS PidFilename,
                              Address,
                              GetTextSegment(Path=Path) AS TextSegmentData
  FROM InfoFromVad(Pid=Pid)
  WHERE Address != 0
   AND TextSegmentData.FileOffset

LET Hex(X) = format(format="%#x", args=X)

-- read the executable from memory and hard disk
LET GetContent(Pid, Name) = SELECT
    *, Name,
    Address AS MemAddress,
    Hex(X=Address - TextSegmentData.VMA) AS ASLR, 
    read_file(accessor="process",
              offset=Address,
              filename=PidFilename,
              length=TextSegmentData.Size) AS MemoryData,
    TextSegmentData.FileOffset AS DiskAddress,
    TextSegmentData.Size AS SegmentSize,
    read_file(accessor="file",
              offset=TextSegmentData.FileOffset,
              filename=Path,
              length=TextSegmentData.Size) AS DiskData
  FROM GetMetadata(Pid=Pid)
  WHERE MemoryData
   AND log(dedup=-1,
           message="Inspecting Pid %v (%v): %#x-%#x vs %#x-%#x",
           args=[Pid, Name, Address, Address + SegmentSize,
             DiskAddress, DiskAddress + SegmentSize])

-- Filter out not needed comparisons early
LET FilterContent(Pid, Name) = SELECT *, MemoryData = DiskData AS Comparison
  FROM GetContent(Pid=Pid, Name=Name)
  WHERE NOT Comparison

LET PAGESIZE <= 1024 * 1024

LET Hex(X) = format(format="%#x", args=[X, ])

LET Int64(X) = parse_binary(profile="",
                            struct="int64",
                            accessor="data",
                            filename=X)

LET _CompareRegions(Base, X, Y, PAGESIZE) = SELECT
    _value + Base AS Offset,
    X[_value:(_value + PAGESIZE)] AS XInt,
    Y[_value:(_value + PAGESIZE)] AS YInt
  FROM range(end=len(list=X), step=PAGESIZE)
  WHERE XInt != YInt

LET CompareRegions(X, Y, IntSize) = SELECT
    Offset,
    Hex(X=XInt) AS X,
    Hex(X=YInt) AS Y,
    Hex(X=Int64(X=XInt) - Int64(X=YInt)) AS Difference
  FROM foreach(row={
    SELECT *
    FROM _CompareRegions(Base=0, X=X, Y=Y, PAGESIZE=PAGESIZE)
  },
               query={
    SELECT *
    FROM foreach(row={
    SELECT *
    FROM _CompareRegions(Base=Offset, X=XInt, Y=YInt, PAGESIZE=4096)
  },
                 query={
    SELECT *
    FROM _CompareRegions(Base=Offset, X=XInt, Y=YInt, PAGESIZE=IntSize)
  })
  })
  LIMIT 500

LET CompareUniqueRegions(X, Y, IntSize, ASLR) = SELECT *
  FROM CompareRegions(X=X, Y=Y, IntSize=IntSize)
  WHERE Difference != ASLR
  GROUP BY Difference

-- compare the executable from memory and hard disk
-- only print the ones where they do not match
LET Compare(Pid, Name, IntSize) = if(
    condition=log(message="Comparing process %v", args=Pid)
     AND IgnoreOneByteOffsets,
    then={
    SELECT Pid,
           Name,
           ASLR,
           IntSize,
           PidFilename,
           Path,
           {
    SELECT Offset,
           X AS MemoryValue,
           Y AS DiskValue,
           Difference,
           Hex(X=ASLR) AS ASLR
    FROM CompareUniqueRegions(X=MemoryData, Y=DiskData, IntSize=IntSize, ASLR=ASLR)
    } AS Differences,
           MemAddress,
           DiskAddress,
           SegmentSize
    FROM FilterContent(Pid=Pid, Name=Name)
    WHERE NOT OneByteOffset
  },
    else={
    SELECT Pid,
           Name,
           ASLR,
           IntSize,
           PidFilename,
           Path,
           MemAddress,
           DiskAddress,
           SegmentSize
    FROM FilterContent(Pid=Pid, Name=Name)
  })

-- compare with uploading the suspicious executables
LET CompareAndUpload(Pid, Name, IntSize) = SELECT
    Pid,
    Path,
    Hex(X=MemAddress) AS MemAddress,
    Hex(X=DiskAddress) AS DiskAddress,
    Hex(X=SegmentSize) AS SegmentSize,
    upload(
      file=pathspec(DelegateAccessor="process",
                    DelegatePath=PidFilename,
                    Path=[dict(Offset=MemAddress, Length=SegmentSize), ]),
      name=pathspec(parse=format(format="%s.%d.mem", args=[Path, Pid]),
                    path_type="windows"),
      accessor="sparse") AS UploadMem,
    upload(
      file=pathspec(DelegateAccessor="file",
                    DelegatePath=Path,
                    Path=[dict(Offset=DiskAddress, Length=SegmentSize), ]),
      name=pathspec(parse=format(format="%s.%d.disk", args=[Path, Pid]),
                    path_type="windows"),
      accessor="sparse") AS UploadDisk
  FROM Compare(Pid=Pid, Name=Name, IntSize=IntSize)

-- for every process, evaluate the memory-harddisk-comparison
SELECT *
FROM foreach(row=GetPids,
             workers=20,
             query={
    SELECT *
    FROM if(condition=UploadFindings,
            then={
    SELECT *
    FROM CompareAndUpload(Pid=Pid, Name=Name, IntSize=IntSize)
  },
            else={
    SELECT *
    FROM Compare(Pid=Pid, Name=Name, IntSize=IntSize)
  })
  })


scudette avatar Oct 16 '25 06:10 scudette

Hi Mike, thanks for the feedback! It runs much faster than the previous version. I've integrated the changes into the artifact and committed them. The PR is now ready to be merged :)

lautarolecumberry avatar Oct 16 '25 13:10 lautarolecumberry

I suggest a default value for ModuleRegEx of . so we scan every module. We could add e.g. "(KERNELBASE|ntdll).dll|.exe" in the description, so people know how to use this field.

Also alignment in lines 134 and 140 broke.

(unfortunately, I do not have edit-rights otherwise I would have changed it myself)

@lautarolecumberry: see https://github.com/lautarolecumberry/DetectingFilelessMalware/pull/6

mdenzel avatar Oct 18 '25 07:10 mdenzel

I think scanning every module will be very slow in practice. I don't think it's that useful for a default

Perhaps have a choice type parameter with something like all, common modules (default) or custom which will allow a regex

scudette avatar Oct 18 '25 08:10 scudette

Good idea, that sounds very sensible. What could be some common modules? The binary itself (e.g. notepad.exe), kernelbase.dll, ntdll.dll, user32.dll, kernel32.dll, shell32.dll, msvcrt.dll, advapi32.dll, comdlg32.dll?

mdenzel avatar Oct 18 '25 08:10 mdenzel

Found a bug (the OneByteOffset variable was removed but is still used).

Could you give me access to the PR? Then we will mark the PR as a draft and mark it ready once @lautarolecumberry and I have fixed and tested it.

mdenzel avatar Oct 18 '25 20:10 mdenzel

Do you know what OneByteOffset is used for? I thought it was handling the case of relocations before there was real code there to handle it - I think this should be removed completely now.

Probably there is also no need to have two code paths now - one with ignoring and one without - we probably should have the relocation code run always.

scudette avatar Oct 21 '25 23:10 scudette

Yes, OneByteOffset is handling BaseOfData relocations. See https://0xrick.github.io/win-internals/pe4/ and https://learn.microsoft.com/en-us/windows/win32/debug/pe-format#optional-header-image-only

It only happens in 32 bit binaries but if it is enabled it happens very often. E.g. for Firefox 32-bit:

Memory   Disk   Times   Difference
0x44     0x06   2724    62 (0x3e)
0x45     0x07   2385    62 (0x3e)
0x42     0x04   21      62 (0x3e)
0x40     0x02   27      62 (0x3e)
0x43     0x05   12      62 (0x3e)
0x41     0x03   1       62 (0x3e)

It is kind of interesting that this happens way more often than ASLR relocations. So we will have two constant offsets in 32 bit binaries (ASLR and BaseOfData) which we need to ignore. I think the new code should take care of it, but I am not 100% sure. We should definitely clean up the old variable and make sure we did not break anything.
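A table like the one above can be produced by tallying the per-byte differences between the two buffers. Here is a minimal Python sketch on made-up sample bytes; the function name is hypothetical and the tiny buffers merely reuse byte pairs from the table.

```python
from collections import Counter


def tally_byte_deltas(mem: bytes, disk: bytes) -> Counter:
    """Count how often each (memory - disk) byte difference occurs."""
    return Counter(m - d for m, d in zip(mem, disk) if m != d)


# Made-up sample: three differing bytes, each offset by 0x3e, and one equal byte.
mem = bytes([0x44, 0x45, 0x90, 0x42])
disk = bytes([0x06, 0x07, 0x90, 0x04])

deltas = tally_byte_deltas(mem, disk)
print(deltas)
```

A single dominant delta (here 62, i.e. 0x3e) suggests a constant offset that can be ignored, while scattered deltas would deserve a closer look.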

We're on it and will update the PR this week to fix it.

mdenzel avatar Oct 22 '25 06:10 mdenzel

The latest commit includes fixes for the BaseOfData left-overs and code to take care of 1-4 byte shifts. I am not entirely sure why they happen, I guess the BaseOfData offsets are not aligned. Anyway, the code checks if the ASLR value is shifted 1-3 bytes and ignores it.

Can you have a look and tell us what you think?

mdenzel avatar Nov 01 '25 15:11 mdenzel

I updated the query to support all permutations of a shifted ASLR - I found evidence of ASLR shifts in 64 bit code as well so this does not seem confined to 32 bit only.

Now running on my test system I only see hits for a process like chrome - I think this is expected as chrome is known to do some fancy hooking in memory. Maybe we need to allow list chrome or filter it out.


scudette avatar Nov 02 '25 07:11 scudette

I also noticed chrome several times when analyzing RAM. It does weird memory manipulations, though it did not flag up with me in Mem2Disk.

I would not filter out chrome by default and leave it to the analyst to decide. I can imagine attackers targeting chrome (or edge). Maybe it would be good if we add a known false positive section in the comments and mention chrome and chrome-based browsers?

If edge shows up in the Mem2Disk detection, I think an analyst should investigate it deeper.

mdenzel avatar Nov 02 '25 09:11 mdenzel

Ok - let me know when you are ready to merge this - we can always iterate on it later

scudette avatar Nov 02 '25 14:11 scudette

This looks pretty good - we can iterate over it in future.

scudette avatar Nov 14 '25 02:11 scudette