Change type of control/segment register (in whole-image scope)

Open mishka-freddy opened this issue 7 months ago • 5 comments

What is the feature you'd like to have?

When reversing Windows kernel internals, ntoskrnl.exe often refers to the gsbase register, which on Windows points to the _KPCR structure. I can change its type in a single function where it is used, but I haven't found a way to change it everywhere it is accessed in the current image. This would be a great feature to have, as it would make kernel analysis easier.

mishka-freddy, May 24 '25 13:05

There has been some internal discussion on this, though I am not sure whether we already have an issue for it.

xusheng6, May 27 '25 05:05

I wrote a simple Python script to change ~all gsbase variables in the code to PKPCR:

import itertools
import multiprocessing
from concurrent.futures import ThreadPoolExecutor

from binaryninja import (log_info, log_error, BinaryView, RegisterName, LowLevelILLoad, LowLevelILReg)

def type_gsbase_as_kpcr(bv: BinaryView):
    arch = bv.arch
    gs_reg_name = RegisterName("gsbase")
    if gs_reg_name not in arch.regs:
        log_error("Current architecture does not have a 'gsbase' segment register.")
        return

    gs_reg_info = arch.regs[gs_reg_name]
    gs_reg_index = gs_reg_info.index

    kpcr_ptr_type = bv.get_type_by_name("PKPCR")
    if kpcr_ptr_type is None:
        log_error("Type 'PKPCR' not found in this BinaryView. Please define or import the PKPCR type first.")
        return

    patched_counter = itertools.count()
    function_counter = itertools.count()

    function_count = len(bv.functions)


    def resolve_function(func):
        log_info(f"Iterate {next(function_counter)} of {function_count}: {func.name}")

        if func.low_level_il is None:
            return

        for block in func.low_level_il:
            for instr in block:
                operands = instr.operands
                if len(operands) != 2 or not isinstance(operands[1], LowLevelILLoad):
                    continue

                # noinspection PyTypeChecker
                load_operand: LowLevelILLoad = operands[1]
                load_register = load_operand.src.operands[0]
                if not isinstance(load_register, LowLevelILReg):
                    continue

                if load_register.src.index != gs_reg_index:
                    continue

                mlil_instr = instr.mlil
                if mlil_instr is None:
                    continue

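                # Skip instructions whose MLIL form reads no variables,
                # otherwise indexing [0] would raise IndexError.
                vars_read = mlil_instr.vars_read
                if not vars_read:
                    continue
                gsbase_var = vars_read[0]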
                if gsbase_var.type != kpcr_ptr_type:
                    gsbase_var.set_type_async(kpcr_ptr_type)
                    next(patched_counter)
                    log_info(f"Patched GS base variable for {func.name}")

    executor = ThreadPoolExecutor(max_workers=multiprocessing.cpu_count())
    for fn in bv.functions:
        executor.submit(resolve_function, fn)

    executor.shutdown()
    # next() on the counter returns how many times it was advanced, i.e. the patch count.
    log_info(f"[TypeGSBASE] Finished: updated {next(patched_counter)} GSBASE references to `PKPCR`.")

if __name__ == "__main__":
    global bv
    bv.set_analysis_hold(True)
    type_gsbase_as_kpcr(bv)
    bv.set_analysis_hold(False)
    bv.update_analysis()
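
One caveat with the script above: `executor.submit` stores any exception raised inside the worker on the returned future, so a failure in `resolve_function` goes unnoticed unless `result()` is called. A minimal sketch of one way to surface such failures and total the patch count is shown here. It assumes the worker is changed to return the number of variables it patched; `patch_one` and `run_all` are hypothetical stand-ins and do not use the Binary Ninja API.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def patch_one(item):
    """Hypothetical stand-in for resolve_function: returns how many variables it patched."""
    if item < 0:
        raise ValueError(f"cannot process {item}")
    return item % 2  # pretend that odd items needed exactly one patch


def run_all(items, worker=patch_one, max_workers=4):
    """Run worker over all items and return (total_patched, failures)."""
    total = 0
    failures = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to its input so failures can be reported by item.
        futures = {executor.submit(worker, item): item for item in items}
        for future in as_completed(futures):
            try:
                total += future.result()  # re-raises any exception from the worker
            except Exception as exc:
                failures.append((futures[future], exc))
    return total, failures
```

With this shape the total is summed on the main thread, so no shared counter is needed, and each failed item can be logged together with its exception.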

mishka-freddy, May 31 '25 15:05

I have what is largely a fix for this that I'm hoping to get pushed in this release. Currently there is an issue that needs to be resolved which is that the Platform type has one definition of KPCR and the PDB has another implementation. Ideally the PDB overwrites the Platform's definition, and then my code below would "just work"; however, this would require a core change.

diff --git a/platform/windows-kernel/platform_windows_kernel.cpp b/platform/windows-kernel/platform_windows_kernel.cpp
index da303f9f..6cd35272 100644
--- a/platform/windows-kernel/platform_windows_kernel.cpp
+++ b/platform/windows-kernel/platform_windows_kernel.cpp
@@ -10,11 +10,15 @@ Ref<Platform> g_windowsKernelX86, g_windowsKernelX64, g_windowsKernelArm64;
 
 class WindowsKernelX86Platform : public Platform
 {
+	uint32_t m_fsbase;
+	Ref<Type> m_kpcr;
+	std::mutex m_kpcrMutex;
 public:
 	WindowsKernelX86Platform(Architecture* arch) : Platform(arch, "windows-kernel-x86")
 	{
 		Ref<CallingConvention> cc;
 
+		m_fsbase = arch->GetRegisterByName("fsbase");
 		cc = arch->GetCallingConventionByName("cdecl");
 		if (cc)
 		{
@@ -54,16 +58,37 @@ public:
 			return g_windowsKernelX86;
 		return nullptr;
 	}
+
+	virtual void BinaryViewInit(BinaryView* view) override
+	{
+		// Locking here so that if we have two views in BinaryViewInit at once we don't race to init m_kpcr.
+		std::lock_guard<std::mutex> lock(m_kpcrMutex);
+		if (!m_kpcr)
+			m_kpcr = Type::PointerType(GetArchitecture()->GetAddressSize(), Type::NamedType(QualifiedName("_KPCR"), GetTypeByName(QualifiedName("_KPCR"))));
+	}
+
+	virtual Ref<Type> GetGlobalRegisterType(uint32_t reg) override
+	{
+		if (reg == m_fsbase)
+			return m_kpcr;
+
+		return nullptr;
+	}
 };
 
 
 class WindowsKernelX64Platform : public Platform
 {
+	uint32_t m_gsbase;
+	Ref<Type> m_kpcr;
+	std::mutex m_kpcrMutex;
+
 public:
 	WindowsKernelX64Platform(Architecture* arch) : Platform(arch, "windows-kernel-x86_64")
 	{
 		Ref<CallingConvention> cc;
 
+		m_gsbase = arch->GetRegisterByName("gsbase");
 		cc = arch->GetCallingConventionByName("win64");
 		if (cc)
 		{
@@ -92,6 +117,18 @@ public:
 			return g_windowsKernelX64;
 		return nullptr;
 	}
+
+	virtual void BinaryViewInit(BinaryView* view) override
+	{
+	}
+
+	virtual Ref<Type> GetGlobalRegisterType(uint32_t reg) override
+	{
+		if (reg == m_gsbase)
+			return Type::PointerType(GetArchitecture()->GetAddressSize(), Type::NamedType(QualifiedName("_KPCR"), GetTypeByName(QualifiedName("_KPCR"))));
+
+		return nullptr;
+	}
 };
 

You then get this kind of output:

Image

plafosse, Jun 11 '25 19:06

Currently there is an issue that needs to be resolved which is that the Platform type has one definition of KPCR and the PDB has another implementation. Ideally the PDB overwrites the Platform's definition

About this, I agree: it was annoying that for most structures in ntoskrnl.exe two versions of the type are created, one from the platform (almost always incomplete) and one from the PDB with a _1 suffix. That required me to write a script to apply the PDB types as the default.
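
The first step of such a script, finding which types exist in both forms, can be sketched in plain Python. This is an illustration only: it assumes the duplicates follow the numeric-suffix convention described above, `find_duplicate_pairs` is a hypothetical helper, and the step that actually retypes or replaces anything in Binary Ninja is left out.

```python
import re

# Matches names such as "_KPCR_1": a base name followed by "_<number>".
_SUFFIX = re.compile(r"^(?P<base>.+)_(?P<n>\d+)$")


def find_duplicate_pairs(type_names):
    """Return {base_name: suffixed_name} for every suffixed name whose base also exists."""
    names = set(type_names)
    pairs = {}
    for name in sorted(names):
        match = _SUFFIX.match(name)
        # Only report a pair when the unsuffixed type is present as well.
        if match and match.group("base") in names:
            pairs.setdefault(match.group("base"), name)
    return pairs
```

The input would be the type names taken from the view; each reported pair is a candidate where the suffixed (PDB) definition might replace the unsuffixed (platform) one.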

mishka-freddy, Jun 11 '25 21:06

Currently there is an issue that needs to be resolved which is that the Platform type has one definition of KPCR and the PDB has another implementation. Ideally the PDB overwrites the Platform's definition

About this, I agree: it was annoying that for most structures in ntoskrnl.exe two versions of the type are created, one from the platform (almost always incomplete) and one from the PDB with a _1 suffix. That required me to write a script to apply the PDB types as the default.

Created a feature request for this.

mishka-freddy, Jun 12 '25 13:06