capstone icon indicating copy to clipboard operation
capstone copied to clipboard

Ignore ES/CS/SS/DS segment overrides in x64 mode

Open jxors opened this issue 1 month ago • 5 comments

Your checklist for this pull request

  • [ ] I've documented or updated the documentation of every API function and struct this PR changes.
  • [ ] I've added tests that prove my fix is effective or that my feature works (if possible)

Detailed description

This draft PR attempts to fix the decoding of x64 instructions with ignored segment overrides. The typical behavior of CPUs, which is copied by most disassemblers, is to completely ignore ES/CS/SS/DS segment overrides and use the last FS/GS override, if any.

...

Test plan

As requested by @Rot127 this is a draft to quickly see if any tests fail. I have not yet added any new tests.

In particular I am unsure about whether my changes correctly cover the cases where 26 and 36 are used as branch (not) taken hints rather than segment overrides.

...

closes #2818

...

jxors avatar Nov 20 '25 17:11 jxors

Seems to not break anything. I need to read into it a little though. But generally looks good.

Rot127 avatar Nov 20 '25 18:11 Rot127

This is going to need a fair bit of testing. In particular, for multiple segment overrides in 32-bit mode all of the disassemblers (including Capstone) use first-seen. We'll want to make sure that is unchanged. I'm happy to help write a few tests cases.

hainest avatar Nov 20 '25 21:11 hainest

I'm happy to help write a few tests cases.

Thank you! This is my first time working with the capstone codebase, so I appreciate all help/advice.

I am reading up on Capstone's testing set up and I will try to write a few tests myself as well.

jxors avatar Nov 21 '25 15:11 jxors

Mark this as draft for now. Please change it back once you think the testing is enough.

Rot127 avatar Nov 24 '25 11:11 Rot127

I have added 21 test cases that cover various prefix combinations for 16, 32 and 64-bit modes and ensure notrack (reuses the DS segment override) is still decoded correctly.

I was not sure where to place the tests, so I have put them in a separate file for now.

@hainest if you have time, would you mind taking a look? Are there any cases that I missed?

jxors avatar Dec 09 '25 15:12 jxors

I would recommend explicitly checking that the correct prefix was found; e.g.,

  -
    input:
      name: "x86-16: rightmost segment override should take priority"
      bytes: [ 0x26, 0x65, 0x64, 0x3E, 0x65, 0x2E, 0x00, 0x00 ]
      arch: "CS_ARCH_X86"
      options: [ CS_MODE_16 ]
    expected:
      insns:
      -
        asm_text: "add byte ptr cs:[bx + si], al"
        details:
          x86:
            prefix: [ X86_PREFIX_0, X86_PREFIX_CS, X86_PREFIX_0, X86_PREFIX_0 ]

However, this raises an error:

Traceback (most recent call last):
  File "/home/tim/workspace/capstone-engine/capstone/bindings/python/cstest_py/src/cstest_py/cstest.py", line 320, in test
    return self.expected.compare(insns, self.input.arch_bits)
  File "/home/tim/workspace/capstone-engine/capstone/bindings/python/cstest_py/src/cstest_py/cstest.py", line 272, in compare
    if not compare_details(a_insn, e_insn.get("details")):
  File "/home/tim/workspace/capstone-engine/capstone/bindings/python/cstest_py/src/cstest_py/details.py", line 233, in compare_details
    if not compare_tbool(insn.writeback, expected.get("writeback"), "writeback"):
  File "/home/tim/workspace/capstone-engine/capstone/bindings/python/capstone/__init__.py", line 935, in writeback
    raise CsError(CS_ERR_DETAIL)
capstone.CsError: Details are unavailable (CS_ERR_DETAIL)

I've confirmed I built with CAPSTONE_BUILD_DIET:BOOL=OFF. @Rot127 any thoughts?

hainest avatar Dec 10 '25 20:12 hainest

Is there a way to directly check the derived segment override rather than the raw prefix?

Due to the fact that the DS segment override prefix, which is normally ignored on 64-bit, is also overloaded as notrack, my patch still needs to set the prefixes even when the segment override is ignored:

if (insn->mode != MODE_64BIT) {
	insn->segmentOverride = SEG_OVERRIDE_CS;
}
insn->prefix1 = byte;

So for an instruction with an ignored segment override this would still set prefix1 to that segment override.

It is not entirely clear to me what the best behavior would be here. Capstone expects there to be one relevant prefix per prefix group, which does not work well for notrack in 64-bit mode.

jxors avatar Dec 11 '25 12:12 jxors

@jxors

Is there a way to directly check the derived segment override rather than the raw prefix?

Currently not. The segment overwrite is not even exposed in the API. If you do a reference search on the segmentOverride member you'll see it is only used the add the respective register operand.

You can expose the segment override in the API as well. I think the best way to achieve this is:

  1. Add a new member x86_segmentOverride in MCInst and x86.h::cs_x86.
  2. Set MCInst->x86_segmentOverride in the disassembler where you current patches happen.
  3. Then copy the value in X86_getInstruction to the detail cs_x86 member.
  4. Lastly add the x86_segmentOverride field to the cstest. As example search the references of TestDetailX86::avx_rm and just do the same thing. You can also push it here and I help if you run into trouble.
  5. Then add it to cstest_py. This is way simpler. Just see how it is done in bindings/python/src/cstest_py/details.py for all the other members.

That said, I can't really say how useful this information is for the end user. I rarely had to do with x86 so can't say.

Besides that, one more question (I am really not that much into x86. So please correct me if I miss-understand something).

But the ISA says in Basic Architecture, Order Number 253665 - 3.3.7.1 Canonical Addressing (from June 2024):

If an instruction uses base registers RSP/RBP and uses a segment override prefix to specify a non-SS segment, a canonical fault generates a #GP (instead of an #SS). In 64-bit mode, only FS and GS segment-overrides are appli- cable in this situation. Other segment override prefixes (CS, DS, ES, and SS) are ignored. Note that this also means that an SS segment-override applied to a “non-stack” register reference is ignored. Such a sequence still produces a #GP for a canonical fault (and not an #SS).

Doesn't this mean it is fine to overwrite the prefixes? Because it is the semantically correct, right?

Another point, FS and GS segment overrides seem to be allowed in 64-bit mode. But the changes here ignore them as well?

Rot127 avatar Dec 11 '25 14:12 Rot127