ghidra icon indicating copy to clipboard operation
ghidra copied to clipboard

Fix `LOCK`, `XACQUIRE`, `XRELEASE` instruction prefix for x86

Open ekilmer opened this issue 2 years ago • 24 comments

Edit3: This PR should be ready for merging/internal edits and testing.

Edit2: This PR is mostly complete with respect to decoding. See this comment for latest update

Edit: This PR now also includes changes for XACQUIRE and XRELEASE prefixes because they are closely related to the LOCK prefix. (See here for details on that implementation https://github.com/NationalSecurityAgency/ghidra/pull/4336#issuecomment-1157664283)

I am looking for feedback (and suggestions for bugfixes) on the current implementation before fixing any other instructions and submitting for a real PR.


Summary:

This commit is a proof-of-concept (POC) implementation to fix how sleigh interprets the LOCK prefix on x86 architecture. Previously, the sleigh specification interpreted LOCK as a standalone instruction; however, that is wrong. The following is a description of a possible implementation to fix this. This implementation only fixes the INC and XADD instructions to demonstrate and hopefully determine whether this implementation is on the right track or if a better implementation exists that we can use.

The "LOCK" disassembly text appears after the mnemonic, which is similar to the REP prefixes. I do not know if making "LOCK" appear before a mnemonic like LOCK INC dword ptr[EAX] is possible or even preferable.

The Intel manual states the following for the LOCK prefix:

The LOCK prefix can be prepended only to the following instructions and only to those forms of the instructions where the destination operand is a memory operand: ADD, ADC, AND, BTC, BTR, BTS, CMPXCHG, CMPXCH8B, CMPXCHG16B, DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD, and XCHG. If the LOCK prefix is used with one of these instructions and the source operand is a memory operand, an undefined opcode exception (#UD) may be generated. An undefined opcode exception will also be generated if the LOCK prefix is used with any instruction not in the above list. The XCHG instruction always asserts the LOCK# signal regardless of the presence or absence of the LOCK prefix

Implementation:

Add lockprefx to contextreg to track the appearance of the prefix and identify the LOCK prefix bit pattern during instruction decoding. It can appear in any order along with the other prefixes. I increased the size of the context register because I do not understand the interaction of the existing fields, and it looks like there are no more free bits. I do not know whether increasing the context register size from 4 to 8 has any adverse or unforeseen effects elsewhere. Explanation, suggestions, or confirmation about this change would be greatly appreciated!

Use the with statement to protect against invalid LOCK prefixes on all instructions that cannot use the LOCK prefix. Instructions that can take the LOCK prefix (only INC and XADD in this POC) should close the with statement (closing } bracket) and handle the LOCK variants explicitly. The with statement is also added in the top-level '*.slaspec' files to surround the @include files for other instructions not in ia.sinc and prevent incorrect disassembly. I discovered this technique by noticing a similar structure in the 68000.sinc file with extGUARD.

Handle INC and XADD with LOCK prefix decoding only with a memory destination operand. I struggled a bit with this, but I think what I've done is okay? I had trouble getting anything else to compile and function correctly. Still, I attempted to split out the memory and register destination operands to only apply LOCK to the memory destination operands. Suggestions on whether a better alternative implementation exists are appreciated!

... admittedly, I only tested the disassembly result and did not inspect the generated pcode with the LOCK prefix. I'm hoping I correctly understood how the pcode is built and that everything lines up correctly. The LOCK() pcodeop should be first for each instruction followed by the normal semantics.


Addresses the bug report about LOCK in https://github.com/NationalSecurityAgency/ghidra/issues/2033

ekilmer avatar Jun 11 '22 20:06 ekilmer

The instruction bytes I've tested are the following. S_SUCCESS means Ghidra should correctly decode to the following disassembly text and S_FAILURE means Ghidra was correctly unable to disassemble the bytes.

      // ----------- INC
      // Normal disassembly with LOCK valid instruction and register destination
      {{0xff, 0xc0}, S_SUCCESS, "INC EAX"},
      // INC.LOCK EAX not valid because it is not a memory destination
      {{0xf0, 0xff, 0xc0}, S_FAILURE, ""},

      // Normal disassembly with LOCK valid instruction and memory destination
      {{0xff, 0x00}, S_SUCCESS, "INC dword ptr [RAX]"},
      // INC.LOCK dword ptr [RAX] is valid because memory destination
      {{0xf0, 0xff, 0x00}, S_SUCCESS, "INC.LOCK dword ptr [RAX]"},

      // Normal disassembly with LOCK valid instruction and sized memory
      // destination
      {{0x66, 0xff, 0x00}, S_SUCCESS, "INC word ptr [RAX]"},
      // INC.LOCK word ptr [RAX] is valid because memory destination
      {{0xf0, 0x66, 0xff, 0x00}, S_SUCCESS, "INC.LOCK word ptr [RAX]"},
      // Move the prefix around
      {{0x66, 0xf0, 0xff, 0x00}, S_SUCCESS, "INC.LOCK word ptr [RAX]"},

      // Normal disassembly with LOCK valid instruction and sized memory
      // destination and sized register
      {{0x66, 0x67, 0xff, 0x00}, S_SUCCESS, "INC word ptr [EAX]"},
      // INC.LOCK word ptr [EAX] is valid because memory destination
      {{0xf0, 0x66, 0x67, 0xff, 0x00}, S_SUCCESS, "INC.LOCK word ptr [EAX]"},
      // Move the prefix around
      {{0x66, 0xf0, 0x67, 0xff, 0x00}, S_SUCCESS, "INC.LOCK word ptr [EAX]"},
      // Move the prefix around
      {{0x66, 0x67, 0xf0, 0xff, 0x00}, S_SUCCESS, "INC.LOCK word ptr [EAX]"},

      // ----------- XADD
      // Normal disassembly with LOCK valid instruction and register destination
      {{0x0f, 0xc1, 0xd8}, S_SUCCESS, "XADD EAX,EBX"},
      // XADD.LOCK EAX,EBX not valid because it is not a memory destination
      {{0xf0, 0x0f, 0xc1, 0xd8}, S_FAILURE, ""},

      // Normal disassembly with LOCK valid instruction and memory destination
      {{0x0f, 0xc1, 0x18}, S_SUCCESS, "XADD dword ptr [RAX],EBX"},
      // XADD.LOCK dword ptr [RAX],EBX is valid because memory destination
      {{0xf0, 0x0f, 0xc1, 0x18}, S_SUCCESS, "XADD.LOCK dword ptr [RAX],EBX"},

      // ----------- LOCK fails
      // Test normal invalid LOCK instruction
      {{0xB8, 0x01, 0x00, 0x00, 0x00}, S_SUCCESS, "MOV EAX,0x1"},
      // MOV.LOCK EAX, 0x1 is not valid because MOV can't have a LOCK prefix
      {{0xf0, 0xb8, 0x01, 0x00, 0x00, 0x00}, S_FAILURE, ""},

      // Test AVX invalid LOCK instruction
      {{0xc5, 0xf9, 0x6f, 0xc1}, S_SUCCESS, "VMOVDQA XMM0, XMM1"},
      // VMOVDQA.LOCK XMM0, XMM1 is not valid because VMOVDQA can't have a LOCK prefix
      {{0xf0, 0xc5, 0xf9, 0x6f, 0xc1}, S_FAILURE, ""},

ekilmer avatar Jun 11 '22 20:06 ekilmer

One random point of interest: I've observed some glibcs selectively jump after the lock prefix on a lock cmpxchg, i.e. some paths go prefixed while others go to the unprefixed instruction.

pgoodman avatar Jun 12 '22 18:06 pgoodman

I've also been looking into the XACQUIRE and XRELEASE prefixes, which are related to LOCK. They require new pcodeops and context register fields.

I am working on that implementation, which will look similar to this, and will make a new commit to this branch when ready.

Again, I'm only going to make the fix for a few instructions until I learn that this implementation is on the right track or I should be doing something different.

ekilmer avatar Jun 14 '22 13:06 ekilmer

After applying there are only BAD-Instruction on previously analyzed binaries. On the other side binaries analyzed with applied patch have problems if opened without patch. Is there a way to fix that?

jobermayr avatar Jun 16 '22 12:06 jobermayr

@jobermayr You would want to increment the minor version number in the ldefs file for the language. This definitely requires an upgrade. I think major version updates are mostly only needed if register definitions are moved around (this PR adds to the context register but doesn't move anything).

GhidorahRex avatar Jun 16 '22 12:06 GhidorahRex

I've pushed a commit for how I think XACQUIRE and XRELEASE prefix should be handled, but in my first attempt, I cannot figure out how to disassemble XCHG correctly for all of the cases. From my commit message, copied here:

XACQUIRE and XRELEASE are related to the LOCK instructions so this proof of concept fix attempts to reuse the LOCK prefix logic to combine the checks.

I need help with one thing that I could not figure out:

For XCHG, you are allowed to have an XACQUIRE or XRELEASE prefix without LOCK, and I thought that by listing both prefix constructors that sleigh would choose those constructors when also faced with a memory operand (where the prefixes are valid only), but this did not work and I had to explicitly state that all three XACQUIRE, XRELEASE, and LOCK prefixes were zero even though both XACQUIRE and XRELEASE constructors handle that case. However, this is wrong because the byte for XACQUIRE and XRELEASE is still valid when there is no memory destination, but removing this breaks disassembly for the memory destinations. The following are my expectations of disassembly for x86-64:

  * "87d9"       - "XCHG ECX,EBX"
  * "f287d9"     - "XCHG ECX,EBX" # Current implementation fails
  * "f387d9"     - "XCHG ECX,EBX" # Current implementation fails
  * "f087d9"     - ""
  * "f2f087d9"   - ""
  * "f0f287d9"   - ""
  * "f2f387d9"   - "XCHG ECX,EBX" # Current implementation fails
  * "8719"       - "XCHG dword ptr [RCX],EBX"
  * "f08719"     - "XCHG.LOCK dword ptr [RCX],EBX"
  * "f28719"     - "XCHG.XACQUIRE dword ptr [RCX],EBX"
  * "f38719"     - "XCHG.XRELEASE dword ptr [RCX],EBX"
  * "f0f28719"   - "XCHG.XACQUIRE.LOCK dword ptr [RCX],EBX"
  * "f2f08719"   - "XCHG.XACQUIRE.LOCK dword ptr [RCX],EBX"
  * "f3f08719"   - "XCHG.XRELEASE.LOCK dword ptr [RCX],EBX"
  * "f3f2f08719" - "XCHG.XACQUIRE.LOCK dword ptr [RCX],EBX"
  * "f2f3f08719" - "XCHG.XRELEASE.LOCK dword ptr [RCX],EBX"

ekilmer avatar Jun 16 '22 13:06 ekilmer

increment the minor version number in the ldefs file for the language.

Bumped now.

major version updates are mostly only needed if register definitions are moved around

The xacquire and xrelease changes currently reuse some bits in the context register (because it's the same byte pattern as existing fields), so I don't think that means anything has moved around, right?

ekilmer avatar Jun 16 '22 13:06 ekilmer

reuse would not constitute moving around, yes. There are some other potential things that could require a major language update that involve changing the parse tree, but I don't think any of your PR changes would impact that so far.

GhidorahRex avatar Jun 16 '22 14:06 GhidorahRex

Is this MR also intended to fix

Bytes:          Meaning:                            Ghidra (also patched):
f0 48 0f b1 11  lock cmpxchg QWORD PTR [rcx],rdx    ??; CMPXCHG qword ptr [RCX],RDX
f0 0f b1 0a     lock cmpxchg DWORD PTR [rdx],ecx    ??; CMPXCHG dword ptr [RDX],ECX
f0 01 01        lock add DWORD PTR [rcx],eax        ??; ADD dword ptr [RCX],EAX
f0 01 08        lock add DWORD PTR [rax],ecx        ??; ADD dword ptr [RAX],ECX

jobermayr avatar Jun 24 '22 12:06 jobermayr

Is this MR also intended to fix

Bytes:          Meaning:                            Ghidra (also patched):
f0 48 0f b1 11  lock cmpxchg QWORD PTR [rcx],rdx    ??; CMPXCHG qword ptr [RCX],RDX
f0 0f b1 0a     lock cmpxchg DWORD PTR [rdx],ecx    ??; CMPXCHG dword ptr [RDX],ECX
f0 01 01        lock add DWORD PTR [rcx],eax        ??; ADD dword ptr [RCX],EAX
f0 01 08        lock add DWORD PTR [rax],ecx        ??; ADD dword ptr [RAX],ECX

Eventually, yes, but before I start spending more time fixing all LOCK and XACQUIRE/XRELEASE instructions, I'd like to wait and receive some feedback on the current implementation for the instructions that I have started fixing, INC, XADD, and XCHG, first.

Since this is my first time really digging into sleigh, I want to make sure I am not missing something obvious that would make this fix more minimal and/or cleaner.

Mostly, my concerns are around the implementation to enforce the following constraints

  • A LOCK prefix is invalid on any instruction that is not in the list of valid LOCK instruction (disassembly should fail)
  • the LOCK prefix is only valid for the above instructions with memory destinations
  • XACQUIRE and XRELEASE normally require a LOCK prefix, except for XCHG instruction, where the LOCK prefix is optional. (This is currently buggy)

ekilmer avatar Jun 24 '22 13:06 ekilmer

These latest commits should allow Ghidra to correctly disassemble the set of instructions that are LOCK-compatible.

I also just realized I missed a special case for XRELEASE only:

The XRELEASE prefix hint can only be used with the following instructions (also referred to as XRELEASE-enabled when used with the XRELEASE prefix):

  • The “MOV mem, reg” (Opcode 88H/89H) and “MOV mem, imm” (Opcode C6H/C7H) instructions. In these cases, the XRELEASE is recognized without the presence of the LOCK prefix.

While fixing the rest of the instructions and re-reading the ISA docs, I think a correct pcode implementation would include an opening LOCK(); at the beginning and closing UNLOCK(); at the end of the instruction's pcode.

More commits will come soon to fix the above two issues.

ekilmer avatar Sep 06 '22 14:09 ekilmer

OK, I've (finally) had a chance to look this over, think about it a bit, and discuss it internally. Using the with statement to protect against invalid disassembly is probably the best way to go here.

One suggestion is to pull all of the lockable instructions out of ia.sinc into a separate file.

Unfortunately, there are a number of related issues in the existing sleigh that should be fixed, particularly when adding support for the XACQUIRE and XRELEASE prefixes. Ideally, we would be able to use the existing context fields for the f2 and f3 prefixes. As far as I can tell, the manual only specifies the "last one wins" rule for which prefix applies when there are multiple f2,f3 prefixes in the cases where f2 and f3 are interpreted as XACQUIRE and XRELEASE. Experimentally, it seems to be the case that this rule holds for those prefixes in general (i.e., for the REPE and REPNE cases as well). It makes me sad to resolve undefined behavior by looking at one example, so there should probably be some more investigation, but it seems like the conditions in ia.sinc that prevent disassembly when both prefixes occur can be relaxed.

There are a number of variants of lockable instructions with the byte=0x80 | byte=0x82 constraint. The byte=0x82 condition is not valid for 64-bit code.

There's a substantial amount of overlap between x86.slaspec and x86-64.slaspec. Most of the content could be moved to x86.slaspec, which x86-64.slaspec could then include.

I'm not suggesting that you would need to do all of this for the PR to be accepted; just a heads up that other changes are needed in related code and some amount of coordination is probably needed. I guess you have several options (in no particular order):

  1. We take the PR (after the lockable instructions are moved to a separate file) but will add to it (and do more testing) before merging. We'll keep the initial commit so you get credit.
  2. You could forge ahead with XACQUIRE and XRELEASE, taking the above comments into account.
  3. Finally, I suppose you could decide that you never wanted to step in this mess and just convert the PR to an issue report.

ghidracadabra avatar Sep 06 '22 21:09 ghidracadabra

Thank you for the comments and suggestions @ghidracadabra!


Using the with statement to protect against invalid disassembly is probably the best way to go here.

👍


One suggestion is to pull all of the lockable instructions out of ia.sinc into a separate file.

Good call. Just to confirm, you only want the lockable versions of the instructions in a separate file? For example, using BTS as a small case:

# ia.sinc
@ifdef IA64
:BTS Rmr64,Reg64 ...
@endif
:BTS Reg16,imm8  ...
:BTS Reg32,imm8  ...
@ifdef IA64
:BTS Reg64,imm8  ...
@endif
# lockable.sinc
@ifdef IA64
:BTS^lockx Mem,Reg64 ...
@endif
:BTS^lockx m16,imm8  ...
:BTS^lockx m32,imm8  ...
@ifdef IA64
:BTS^lockx m64,imm8  ...
@endif

And I would move the with : lockprefx=0 { statement in the *.slaspec files to include ia.sinc and exclude the new lockable.sinc.


Unfortunately, there are a number of related issues in the existing sleigh that should be fixed, particularly when adding support for the XACQUIRE and XRELEASE prefixes. Ideally, we would be able to use the existing context fields for the f2 and f3 prefixes. As far as I can tell, the manual only specifies the "last one wins" rule for which prefix applies when there are multiple f2,f3 prefixes in the cases where f2 and f3 are interpreted as XACQUIRE and XRELEASE. Experimentally, it seems to be the case that this rule holds for those prefixes in general (i.e., for the REPE and REPNE cases as well). It makes me sad to resolve undefined behavior by looking at one example, so there should probably be some more investigation, but it seems like the conditions in ia.sinc that prevent disassembly when both prefixes occur can be relaxed.

Yeah, I searched the manual just now and couldn't find a reference for the "last one wins" rule applying to REPE and REPNE cases, but it probably makes sense that it follows the same rule. I can also take a look at this.


There are a number of variants of lockable instructions with the byte=0x80 | byte=0x82 constraint. The byte=0x82 condition is not valid for 64-bit code.

I think I'm missing something. Is this a bug in the existing pcode (in master branch) or a bug I introduced in this PR? From what I understand, the LOCK prefix is valid for 32-bit code as well.


There's a substantial amount of overlap between x86.slaspec and x86-64.slaspec. Most of the content could be moved to x86.slaspec, which x86-64.slaspec could then include.

Sounds good. This should probably be a single, separate commit.


As for what I will do next, I can rebase this PR to be a bit more organized with good commit messages, taking into account the suggestions (and clarification answers) above.

I think it's a good idea to continue with completion of the XACQUIRE and XRELEASE prefixes because they are pretty closely tied to LOCK and there isn't too much left.

I can notify you when I'm done with that, and you can do some internal testing of the PR and modify it as you see fit (option 1).

Thanks again!

ekilmer avatar Sep 08 '22 14:09 ekilmer

I think that just having the lockable versions in the separate sinc file is the way to go. I would put comments in ia.sinc for each instruction stating where the lockable variants can be found (so one comment for ADD, one for ADC, etc).

The byte=0x80 | byte=0x82 is a bug in the master branch that I noticed while reviewing your code. The LOCK prefix is valid for 32-bit code, but using opcode 0x82 instead of 0x80 is not valid in 64-bit code. Sorry for the confusion; I was just providing this as an example of an existing error in the sleigh in code that this PR touches. I can fix it after you've submitted.

ghidracadabra avatar Sep 09 '22 18:09 ghidracadabra

I believe you may need to just have a thought for 16-bit versions of x86 as well. (Tbh, I don't know if what you're proposing affects 16-bit or not, I just know there are quite a few oustanding issues in this regard and don't want it to be made any worse!!!)

Wall-AF avatar Sep 09 '22 20:09 Wall-AF

@ghidracadabra I think this is ready for internal review/testing.

I've rebased and split up the changes into a few commits based on some of your suggestions. Feel free to cherry-pick those for faster merge if you'd like. I also combined the LOCK and XACQUIRE/XRELEASE implementation into a single commit because, honestly, it would be a lot of changes and time to split into two commits, so I hope you don't mind.

If there's anything else I can do on my end to make it easier or faster to review, please let me know. Thank you.

ekilmer avatar Sep 11 '22 23:09 ekilmer

Is there an easy way to fix stack after updating from v20220616 to current patch version? image

Edit: maybe it is caused by missing Rmr64 constructor (left: unpatched, right: current patch): image

@ekilmer To fix an IP issue while building I need:

diff --git a/Ghidra/Processors/x86/certification.manifest b/Ghidra/Processors/x86/certification.manifest index 7d95a8f6a..2e66f4848 100644 --- a/Ghidra/Processors/x86/certification.manifest +++ b/Ghidra/Processors/x86/certification.manifest @@ -11,6 +11,7 @@ data/languages/cet.sinc||GHIDRA||||END| data/languages/clwb.sinc||GHIDRA||||END| data/languages/fma.sinc||GHIDRA||||END| data/languages/ia.sinc||GHIDRA||||END| +data/languages/lockable.sinc||GHIDRA||||END| data/languages/lzcnt.sinc||GHIDRA||||END| data/languages/macros.sinc||GHIDRA||||END| data/languages/mpx.sinc||GHIDRA||||END|

jobermayr avatar Sep 12 '22 10:09 jobermayr

@jobermayr Thank you for the comments! And yes, it looks like I made a very big mistake with regards to splitting out the memory and register destination for lockable instructions. I am still getting familiar with sleigh, so I thought I could replace something like spec_rm8 with Reg8 to get only the register, but this is very wrong because the masks don't align (as shown in your screenshot). Thank you for the review.

I am in the process of correcting this for the register-destination instructions. The lockable versions should be fine, but I will double check.

ekilmer avatar Sep 12 '22 12:09 ekilmer

@jobermayr Please try now. I force-pushed to fix the Rmr issue.

ekilmer avatar Sep 12 '22 13:09 ekilmer

@jobermayr Please try now. I force-pushed to fix the Rmr issue.

It works now. Thanks a lot for doing this. Would be really nice if the decompiler could one day also simplify the Interlocked functions like IDA: image

jobermayr avatar Sep 12 '22 20:09 jobermayr

@ekilmer I'm still reviewing this, but I wanted to mention two things.

  1. In the semantic section of the CMPXCHG variant on line 139 of lockable.sinc, there is a comment about an error due to duplicate build unlock; statements. You can avoid duplicate build statements by appending <end> build unlock; to the end of the section and replacing the goto inst_next; with goto <end>;.
  2. While testing, I noticed something peculiar. In a few linux libc .so files I examined (both 32 and 64 bit), there is an interesting optimization. The code looks something like this: 74 01 JZ +1 f0 4c 0f b1 29 CMPXCHG.LOCK qword ptr [RCX], R13 Basically, the code checks a value to determine whether it wants the locked variant of an instruction. If it wants the unlocked variant, it jumps 1 byte over the f0 prefix. With this PR, such code results in a disassembly error in the listing since you can't have overlapping instructions. In master this error does not occur since f0 is considered a one-byte instruction. You can see this in your branch by importing and analyzing a libc so from linux, opening the "Bookmarks" menu, and typing "conflicting" into the filter.

I experimented with a fix using a delayslot directive, but as far as I can tell the manual does not forbid repeating the lock prefix and nested delay slots result in a disassembly error. This isn't a reason to reject the PR, but I figured I'd mention it in case it's of interest to you.

Edit: It might be possible to avoid the nested delay slots error by introducing some context; not sure whether it's worth it though.

ghidracadabra avatar Sep 19 '22 20:09 ghidracadabra

@ekilmer I'm still reviewing this, but I wanted to mention two things.

  1. In the semantic section of the CMPXCHG variant on line 139 of lockable.sinc, there is a comment about an error due to duplicate build unlock; statements. You can avoid duplicate build statements by appending <end> build unlock; to the end of the section and replacing the goto inst_next; with goto <end>;.

Thank you! Fixed in latest force-push.

  1. While testing, I noticed something peculiar. In a few linux libc .so files I examined (both 32 and 64 bit), there is an interesting optimization. The code looks something like this: 74 01 JZ +1 f0 4c 0f b1 29 CMPXCHG.LOCK qword ptr [RCX], R13 Basically, the code checks a value to determine whether it wants the locked variant of an instruction. If it wants the unlocked variant, it jumps 1 byte over the f0 prefix. With this PR, such code results in a disassembly error in the listing since you can't have overlapping instructions. In master this error does not occur since f0 is considered a one-byte instruction. You can see this in your branch by importing and analyzing a libc so from linux, opening the "Bookmarks" menu, and typing "conflicting" into the filter.

I experimented with a fix using a delayslot directive, but as far as I can tell the manual does not forbid repeating the lock prefix and nested delay slots result in a disassembly error. This isn't a reason to reject the PR, but I figured I'd mention it in case it's of interest to you.

Edit: It might be possible to avoid the nested delay slots error by introducing some context; not sure whether it's worth it though.

Ahhh, yes, @pgoodman mentioned that in an earlier comment too 😬 https://github.com/NationalSecurityAgency/ghidra/pull/4336#issuecomment-1153251230.

ekilmer avatar Sep 20 '22 18:09 ekilmer

Ok, I've finished reviewing this. Overall looks good, thank you for taking this on!

There are a few minor issues to address before we can accept this:

  1. Although certain MOV instructions can take an XRELEASE prefix, they do not accept a LOCK prefix. For example, f0 48 89 18 is an invalid instruction and should not disassemble. So the MOV instructions should probably be moved back to ia.sinc.
  2. Possibly related to this, but there are some TODO statements about identical patterns (e.g. line 3038 of ia.sinc.). Were we missing variants? Is this an consequence of moving some of the MOV variants to lockable.sinc?
  3. XCHG with a memory operand asserts a lock signal whether or not there is a LOCK prefix, so the pcode section should just have calls to LOCK and UNLOCK rather than build directives.

There are some other possible improvements, such as better handling of the one-byte jump and having the "LOCK" appear as a prefix in the disassembly. These touch areas of the code other than those relevant to this PR and can be considered after this is merged.

Edit: I see your above comment related to #2. I'll investigate and make any necessary fixes after you've submitted.

ghidracadabra avatar Sep 22 '22 17:09 ghidracadabra

Ok, I've finished reviewing this. Overall looks good, thank you for taking this on!

You're welcome! And thank you again for the review(s).

There are a few minor issues to address before we can accept this:

  1. Although certain MOV instructions can take an XRELEASE prefix, they do not accept a LOCK prefix. For example, f0 48 89 18 is an invalid instruction and should not disassemble. So the MOV instructions should probably be moved back to ia.sinc.

Fixed. Force-pushed.


  1. Possibly related to this, but there are some TODO statements about identical patterns (e.g. line 3038 of ia.sinc.). Were we missing variants? Is this an consequence of moving some of the MOV variants to lockable.sinc?

It was a consequence of splitting the original pattern into one that only accepts memory destination operand (so that I could look for a valid XRELEASE prefix) and one that only supports a register destination operand.

In the original pattern,

:MOV spec_rm8,imm8       is vexMode=0 & byte=0xc6; (spec_rm8 & reg_opcode=0 ...); imm8        { spec_rm8 = imm8; }

I wasn't certain what reg_opcode=0 actually meant, so I was just following my process of replacing spec_rm8 with spec_m8 (for only capturing memory operands)

MOV^xrelease spec_m8,imm8       is vexMode=0 & xrelease & byte=0xc6; spec_m8 & reg_opcode=0 ...; imm8        { build xrelease; spec_m8 = imm8; }

and then, for the register destination, replacing spec_rm8 with Rmr8 and adding mod=3

:MOV Rmr8,imm8 is vexMode=0 & byte=0xc6; (mod=3 & Rmr8 & reg_opcode=0); imm8 { Rmr8 = imm8; }

which caused the identical pattern error with an existing pattern that looks very similar but with a CRmr8 instead of Rmr8

MOV CRmr8,imm8     is vexMode=0 & byte=0xc6; (CRmr8 & mod=3 & reg_opcode=0); imm8      { CRmr8 = imm8; }

So, I don't fully understand some of the pattern matching going on here, and thought it'd be good to call out the error 😕.


  1. XCHG with a memory operand asserts a lock signal whether or not there is a LOCK prefix, so the pcode section should just have calls to LOCK and UNLOCK rather than build directives.

Good catch! Fixed. Force-pushed.


There are some other possible improvements, such as better handling of the one-byte jump and having the "LOCK" appear as a prefix in the disassembly. These touch areas of the code other than those relevant to this PR and can be considered after this is merged.

Edit: I see your above comment related to #2. I'll investigate and make any necessary fixes after you've submitted.

Sounds good!

ekilmer avatar Sep 23 '22 14:09 ekilmer

@ekilmer Can you please adapt it to 71dcf8e ?

jobermayr avatar Dec 13 '22 23:12 jobermayr

@jobermayr Rebased and pushed

ekilmer avatar Dec 19 '22 18:12 ekilmer

From https://github.com/NationalSecurityAgency/ghidra/pull/4336#issuecomment-1244435811

Would be really nice if the decompiler could one day also simplify the Interlocked functions like IDA:

image

diff --git a/Ghidra/Processors/x86/data/languages/ia.sinc b/Ghidra/Processors/x86/data/languages/ia.sinc
index a7017dec1..c457f5248 100644
--- a/Ghidra/Processors/x86/data/languages/ia.sinc
+++ b/Ghidra/Processors/x86/data/languages/ia.sinc
@@ -1090,11 +1090,11 @@ xrelease: ".XRELEASE"	is xacquireprefx=0 & xreleaseprefx=1 { XRELEASE(); }
 xrelease: 	is epsilon { }
 
 # LOCK prefix
-lock:  ".LOCK"	is lockprefx=1 { LOCK(); }
+lock:  ".LOCK"	is lockprefx=1 { }
 lock:        	is epsilon { }
-unlock:		is lockprefx=1 { UNLOCK(); }
+unlock:		is lockprefx=1 { }
 unlock:		is epsilon { }
-lockx: xlockprefx^".LOCK"	is lockprefx=1 & xlockprefx { build xlockprefx; LOCK(); }
+lockx: xlockprefx^".LOCK"	is lockprefx=1 & xlockprefx { build xlockprefx; }
 lockx: 		is epsilon { }
 
 # Some macros
diff --git a/Ghidra/Processors/x86/data/languages/lockable.sinc b/Ghidra/Processors/x86/data/languages/lockable.sinc
index d62535415..0921f24e0 100644
--- a/Ghidra/Processors/x86/data/languages/lockable.sinc
+++ b/Ghidra/Processors/x86/data/languages/lockable.sinc
@@ -127,6 +127,7 @@
 :BTS^lockx^unlock m64,imm8     is $(LONGMODE_ON) & vexMode=0 & lockx & unlock & opsize=2 & byte=0xf; byte=0xba; m64 & reg_opcode=5 ...; imm8   { build lockx; local bit=imm8&0x3f; local val=(m64>>bit)&1; m64=m64 | (1<<bit); CF=(val!=0); build unlock; }
 @endif
 
+define pcodeop _InterlockedCompareExchange;
 :CMPXCHG^lockx^unlock m8,Reg8  is vexMode=0 & lockx & unlock & byte=0xf; byte=0xa6; m8 & Reg8 ...           { build lockx; _cmpxchg(m8, Reg8); build unlock; }
 :CMPXCHG^lockx^unlock m16,Reg16  is vexMode=0 & lockx & unlock & byte=0xf; byte=0xa7; m16 & Reg16 ...           { build lockx; _cmpxchg(m16, Reg16); build unlock; }
 :CMPXCHG^lockx^unlock m8,Reg8  is vexMode=0 & lockx & unlock & byte=0xf; byte=0xb0; m8 & Reg8 ...           { build lockx; subflags(AL,m8); local tmp=AL-m8; resultflags(tmp);
@@ -139,28 +140,11 @@
 					  build unlock; }
 :CMPXCHG^lockx^unlock m32,Reg32    is vexMode=0 & lockx & unlock & opsize=1 & byte=0xf; byte=0xb1; m32 & Reg32 ... & check_EAX_dest ...
 {
-    build lockx;
-	#this instruction writes to either EAX or m32
-	#in 64-bit mode, a 32-bit register that is written to
-	#(and only the register that is written to)
-	#must be zero-extended to 64 bits
-	subflags(EAX,m32);
-	local tmp=EAX-m32;
-	resultflags(tmp);
-    if (ZF==1) goto <equal>;
-    EAX = m32;
-    build check_EAX_dest;
-    goto <inst_end>;
-<equal>
-    m32 = Reg32;
-<inst_end>
-    build unlock;
+    EAX = _InterlockedCompareExchange(m32, Reg32, EAX);
 }
 @ifdef IA64
-:CMPXCHG^lockx^unlock m64,Reg64    is $(LONGMODE_ON) & vexMode=0 & lockx & unlock & opsize=2 & byte=0xf; byte=0xb1; m64 & Reg64 ...  { build lockx; subflags(RAX,m64); local tmp=RAX-m64; resultflags(tmp);
-                                          local diff = m64^Reg64; m64 = m64 ^ (zext(ZF) * diff);
-                                          diff = RAX ^ m64; RAX = RAX ^ (zext(ZF==0) * diff);
-					  build unlock; }
+define pcodeop _InterlockedCompareExchange64;
+:CMPXCHG^lockx^unlock m64,Reg64    is $(LONGMODE_ON) & vexMode=0 & lockx & unlock & opsize=2 & byte=0xf; byte=0xb1; m64 & Reg64 ...  { RAX = _InterlockedCompareExchange64(m64, Reg64, RAX); }
 @endif
 
 :CMPXCHG8B&lockx  m64        is vexMode=0 & lockx & unlock & byte=0xf; byte=0xc7; ( mod != 0b11 & reg_opcode=1 ) ... & m64 {

jobermayr avatar Apr 14 '23 15:04 jobermayr