pykdebugparser adding CPU Trace support

Is it possible to add support for CPU Trace (new on M4 machines)?

Inside the trace file, the data is stored under the tag '0a00010000000000' (in hex).

Under that tag there are multiple files, each with its name at the beginning.

Jul 23 '25 14:07 shmuelfomberg

Sure, could you please supply raw kdebug data for tests?

Jul 24 '25 06:07 matan1008

test_trace-010.atrc.zip

Created by the "trace" command. Here is some Python code that I added to parse_v3 in pykdebugparser/kd_buf_parser.py while trying to parse the data:

TRACEV3_MACH_SYSCALLS = b'\x01\x20\x01\x00\x00\x00\x00\x00'
TRACEV3_TRACE_STATISTICS = b'\x16\x80\x00\x00\x01\x00\x00\x00'

        elif block.tag == TRACEV3_MACH_SYSCALLS:
            mach_syscalls_data = json.loads(block.data)
            mach_syscalls = mach_syscalls_data["mach_syscalls"]
            # mach_syscalls is an array, each is a dict such as:
            # {
            #   "number": 12,
            #   "name": "mach_vm_deallocate",
            #   "arguments": ["mach_port_name_t target", "mach_vm_address_t address", "mach_vm_size_t size"]
            # }
        elif block.tag == TRACEV3_TRACE_STATISTICS:
            # JSON containing statistics of the trace; possibly interesting keys:
            # "kdebugEventsSize" "chunkCount" "recordingInstructions" 
            pass
        elif block.tag.hex() in ['2080000000000000', 'ff51000000000000', '0880000000000000', 'fa51000000000000', '0250000001000100']:
            #print("============ TAG (plist): " + block.tag.hex())
            #print(plistlib.loads(block.data))
            pass
            # 2080000000000000 is empty
            # ff51000000000000 is about IOKit, huge amount of data
            # 0880000000000000 is display data
            # fa51000000000000 is empty
            # 0250000001000100 is about AGXDriverInfo
        elif block.tag.hex() == '0a00010000000000':
            # each block is 16324 bytes: a JSON header, then zero padding, then the data
            print("============ TAG: " + block.tag.hex())
            data = block.data
            ix = data.index(b'}')
            key = json.loads(data[0:ix+1])["Key"]
            print(key)
            data = data[ix+1:]
            ix = find_first_non_zero_byte(data)
            if ix is None:
                print("[Total size {} Leading Zeros {} Data left {}]".format(len(data), len(data), 0))
            else:
                print("[Total size {} Leading Zeros {} Data left {}]".format(len(data), ix, len(data[ix:])))
                data = data[ix:] 
                if key == "manifest.json":
                    aData = data.decode()
                    # str.replace returns a new string, so the result must be kept
                    aData = aData.replace("\\n", "\n")
                    print(aData)
                elif 'UnitMarks' in key:
                    # dump as 24-byte records, three 8-byte fields each
                    for ix in range(len(data) // 24):
                        start = ix * 24
                        print("{} {} {}".format(data[start:start+8].hex(), data[start+8:start+16].hex(), data[start+16:start+24].hex()))
                elif 'Ranges' in key:
                    # leading zeros were stripped above, so pad back to an 8-byte boundary
                    if len(data) % 8 != 0:
                        data = b'\x00'*(8 - (len(data) % 8)) + data
                    for ix in range(len(data) // 24):
                        start = ix * 24
                        # addresses are read as little-endian 32-bit values, shifted left by 2
                        address1 = hex(int.from_bytes(data[start:start+4], byteorder='little') << 2)
                        flags1 = data[start+4:start+12].hex()
                        address2 = hex(int.from_bytes(data[start+12:start+16], byteorder='little') << 2)
                        third = data[start+16:start+24].hex()
                        print("{} {}".format(address1, flags1))
                        print("{} {}".format(address2, third))
                else:
                    print(data)
        else:
            print("============ TAG: " + block.tag.hex())
            print(block.data)
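
The snippet above calls find_first_non_zero_byte, which is not shown. Below is a minimal sketch of what such a helper might look like, together with a hypothetical split_cpu_trace_block function (my own name, not part of pykdebugparser) that isolates the header/payload split for a '0a00010000000000' block. It assumes, as the snippet does, that the JSON header is flat (the first '}' ends it) and that the payload follows a run of zero bytes.

```python
import json
from typing import Optional, Tuple


def find_first_non_zero_byte(data: bytes) -> Optional[int]:
    """Return the index of the first non-zero byte, or None if there is none."""
    stripped = data.lstrip(b"\x00")
    if not stripped:
        return None
    return len(data) - len(stripped)


def split_cpu_trace_block(data: bytes) -> Tuple[str, bytes]:
    """Split a block into (key, payload).

    Hypothetical helper: assumes a flat JSON header with a "Key" field,
    followed by zero padding and then the payload bytes.
    """
    end = data.index(b"}")  # assumption: no nested objects in the header
    key = json.loads(data[:end + 1])["Key"]
    rest = data[end + 1:]
    start = find_first_non_zero_byte(rest)
    payload = b"" if start is None else rest[start:]
    return key, payload
```

Note that stripping the zero padding also drops any zero bytes that belong to the start of the payload, which is why the 'Ranges' branch has to pad the data back to an 8-byte boundary.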

Jul 24 '25 08:07 shmuelfomberg

Sorry for the delay, do you have any resources about how to parse UnitMarks and Ranges?

Jul 31 '25 05:07 matan1008

Sorry, I don't have anything.

Jul 31 '25 11:07 shmuelfomberg