ghidra icon indicating copy to clipboard operation
ghidra copied to clipboard

Convert countLeadingZeros followed by shift to a comparison

Open Pokechu22 opened this issue 3 years ago • 28 comments

Fixes #2801 and also fixes #2121. This introduces Pcode ops for countLeadingZeros and countLeadingOnes, and also a rule to simplify countLeadingZeros when it's used to implement equality testing. I only implemented countLeadingOnes because there was already an implementation of it on the Java side and it felt odd to convert countLeadingZeros but not countLeadingOnes.

I'm not 100% sure I've done everything right here, but it seems to work for the most part. Feedback is appreciated.

The only remaining issue I know of is that types aren't deduced right; also if the return type is set to bool then SUB41 appears (but that is not specific to this implementation and happens with other powerpc things IIRC).

BADTYPE equals3_32(int param_1)

{
  return param_1 == 3;
}
uint equals3_64(longlong param_1) // types manually specified

{
  return param_1 == 3;
}
BADTYPE equals3_8(char param_1) // optimisations disabled

{
  return param_1 == '\x03';
}

The logic for introducing a new Pcode op was based on 9c23383fa50f93cafa5540731a75219bfc99a729, 0fdd29b98dc5a42f4a1f1eb237fb6ccf170d0ad6, 28d479f188b7566a94ad061fdc0b1da53c92d394, and 04ddbf5981398744453a181796c89d28fe8d3fe5. I also had to run generateParsers to generate new code from the lex and yacc specifications, which was not necessary until 4b1beb742f0110ea7653a5010e143bd70b37aefe.

Pokechu22 avatar Mar 05 '21 08:03 Pokechu22

Is BADTYPE a user defined typedef on byte or is the decompiler actually outputting that?

As for the SUB43 it is likely due to an explicit size in the sleigh implementation of an instruction or in the new pcode implementation. If it is possible to have the decompiler infer the size it might go away. The pcode implementations for carry, borrow, and sborrow return a bool varnode so you could check those for reference.

astrelsky avatar Mar 06 '21 16:03 astrelsky

The decompiler is actually outputting BADTYPE; it looks like genericTypeName uses BADTYPE in its default case. I'll look into those implementations and see if there's something done there I can copy.

EDIT: To be clear, the BADTYPE doesn't come from the pcode implementation; it comes from the simplification rule. This:

                             undefined fun()
             undefined         r3:1           <RETURN>
                             fun
        00010000 68 63 00 03     xori       r3,r3,0x3
        00010004 7c 63 00 34     cntlzw     r3,r3
        00010008 4e 80 00 20     blr

decompiles to

int fun(uint param_1)

{
  return COUNTLEADINGZEROS(param_1 ^ 3);
}

which is just fine.

Pokechu22 avatar Mar 06 '21 18:03 Pokechu22

Hmm, I'm still not sure how to fix it. I think it comes from me changing CPUI_INT_RIGHT or CPUI_INT_SRIGHT to CPUI_INT_EQUAL, while all the other cases in ruleaction only change something that was a comparison into a new comparison. I also don't see anything in ruleaction that explicitly sets a type. Any idea how I can fix that?


EDIT: It looks like I also forgot to push after making a change that fixed COUNTLEADINGZEROS and COUNTLEADINGONES not being listed in OpCode.java, which would have prevented this from running correctly. Oops.

Pokechu22 avatar Mar 08 '21 23:03 Pokechu22

Hmm, I'm still not sure how to fix it. I think it comes from me changing CPUI_INT_RIGHT or CPUI_INT_SRIGHT to CPUI_INT_EQUAL, while all the other cases in ruleaction only change something that was a comparison into a new comparison. I also don't see anything in ruleaction that explicitly sets a type. Any idea how I can fix that?

EDIT: It looks like I also forgot to push after making a change that fixed COUNTLEADINGZEROS and COUNTLEADINGONES not being listed in OpCode.java, which would have prevented this from running correctly. Oops.

I'm not super familiar with the decompiler yet. However I suspect it may be related to the output varnode when the change occurs. Currently it appears that the output varnode isn't being changed when maybe it should be? You could explicitly set the new varnode's datatype as well.

astrelsky avatar Mar 09 '21 00:03 astrelsky

I'm not entirely sure how you get a Datatype object for a builtin type. There also isn't a direct setter for the type, only setTempType and updateType, nor is there a setter for the size.

I did try replacing this:

https://github.com/NationalSecurityAgency/ghidra/blob/3565ecf826f20a98ce62b696e255abd78f01be5c/Ghidra/Features/Decompiler/src/decompile/cpp/ruleaction.cc#L9153-L9158

with this:

      // Becomes a comparison with zero
      Varnode* b = data.newConstant(outVn->getSize(), 0);
      data.opSetOpcode(baseOp, CPUI_INT_EQUAL);
      data.opSetInput(baseOp, op->getIn(0), 0);
      data.opSetInput(baseOp, b, 1);
      Varnode* oldOut = baseOp->getOut();
      Varnode* newOut = data.newVarnodeOut(oldOut->getSize(), oldOut->getAddr(), baseOp);
      data.totalReplace(oldOut, newOut);
      data.opSetOutput(baseOp, newOut);
      return 1;

or this:

      // Becomes a comparison with zero
      PcodeOp* newOp = data.newOp(2, baseOp->getAddr());
      data.opSetOpcode(newOp, CPUI_INT_EQUAL);
      Varnode* b = data.newConstant(outVn->getSize(), 0);
      data.opSetInput(newOp, op->getIn(0), 0);
      data.opSetInput(newOp, b, 1);

      Varnode* oldOut = baseOp->getOut();
      Varnode* newOut = data.newVarnodeOut(oldOut->getSize(), oldOut->getAddr(), newOp);
      data.opSetOutput(newOp, newOut);
      data.opInsertBefore(newOp, baseOp);
      data.totalReplace(oldOut, newOut);
      data.opDestroy(baseOp);
      return 1;

There didn't seem to be any changes from it. (This is under the assumption that newVarnodeOut uses information about newOp to decide the type to use; having taken a quick look at what newVarnodeOut actually does, I do have some further ideas for how to set the type properly, and it might not actually use information about newOp to chose the type.)

Trying one further thing:

      // Becomes a comparison with zero
      PcodeOp* newOp = data.newOp(2, baseOp->getAddr());
      data.opSetOpcode(newOp, CPUI_INT_EQUAL);
      Varnode* b = data.newConstant(outVn->getSize(), 0);
      data.opSetInput(newOp, op->getIn(0), 0);
      data.opSetInput(newOp, b, 1);

      Varnode* oldOut = baseOp->getOut();
      Varnode* newOut = data.newVarnodeOut(1, oldOut->getAddr(), baseOp);
      data.opSetOutput(newOp, newOut);
      data.opInsertBefore(newOp, baseOp);
      data.totalReplace(oldOut, newOut);
      data.opDestroy(baseOp);
      return 1;

This causes code to decompile as

undefined4 equals3_32(int param_1)

{
  return (undefined4)(param_1 == 3);
}

which is mildly surprising; if I explicitly set the return type to bool I get this:

bool equals3_32(int param_1)

{
  return SUB11(param_1 == 3,0);
}

I guess that's progress, but now I'm not sure why it's using undefined4 when I explicitly set the size to 1...

Pokechu22 avatar Mar 09 '21 07:03 Pokechu22

I've settled on this:

      // Becomes a comparison with zero
      PcodeOp* newOp = data.newOp(2, baseOp->getAddr());
      data.opSetOpcode(newOp, CPUI_INT_EQUAL);
      Varnode* b = data.newConstant(outVn->getSize(), 0);
      data.opSetInput(newOp, op->getIn(0), 0);
      data.opSetInput(newOp, b, 1);

      Varnode* oldOut = baseOp->getOut();
      Datatype* ct = data.getArch()->types->getBase(1, TYPE_BOOL);
      Varnode* newOut = data.newVarnode(1, oldOut->getAddr(), ct);
      data.opSetOutput(newOp, newOut);
      data.opInsertBefore(newOp, baseOp);
      data.totalReplace(oldOut, newOut);
      data.opDestroy(baseOp);
      return 1;

With this test case,

#include <stdbool.h>

__attribute__((weak)) void foo(bool a, bool b) { } // __weak to ensure it's not inlined/optimized out
bool flag1b, flag2b, flag3b;
int  flag1i, flag2i, flag3i;
void bar(int a, int b) {
	foo(a == 3, b == 2);

	flag1b = a == 1;
	flag2b = b == 2;
	flag3b = a == 3 & b == 4;  // prevent gcc from generating branches

	flag1i = a == 5;
	flag2i = b == 6;
	flag3i = a == 7 & b == 8;
}

decompiles to this at O0:

void bar(int param_1,int param_2)
{
  foo(param_1 == 3,param_2 == 2);
  flag1b = SUB11(param_1 == 1,0);
  flag2b = SUB11(param_2 == 2,0);
  flag3b = SUB11(param_2 == 4,0) & SUB11(param_1 == 3,0);
  flag1i = (undefined4)(param_1 == 5);
  flag2i = (undefined4)(param_2 == 6);
  flag3i = param_2 == 8 && param_1 == 7;
  return;
}

and this at O3:

void bar(int param_1,int param_2)
{
  foo(param_1 == 3,param_2 == 2);
  flag3i = param_1 == 7 && param_2 == 8;
  flag2i = (undefined4)(param_2 == 6);
  flag1i = (undefined4)(param_1 == 5);
  flag3b = SUB11(param_2 == 4,0) & SUB11(param_1 == 3,0);
  flag2b = SUB11(param_2 == 2,0);
  flag1b = SUB11(param_1 == 1,0);
  return;
}

The variables are detected as undefined1, but selecting "override signature" for foo fills in to void foo(bool,bool), so it's recognizing that it's a bool at least.

Pokechu22 avatar Mar 09 '21 22:03 Pokechu22

The rule isn't going to be applied here. There is no right shift.

Right, the point of that example was to show that when the rule wasn't applied, BADTYPE isn't produced, and thus COUNTLEADINGZEROS on its own isn't responsible for BADTYPE, so instead it's the rule that was responsible for it.

Pokechu22 avatar Mar 12 '21 02:03 Pokechu22

The rule isn't going to be applied here. There is no right shift.

Right, the point of that example was to show that when the rule wasn't applied, BADTYPE isn't produced, and thus COUNTLEADINGZEROS on its own isn't responsible for BADTYPE, so instead it's the rule that was responsible for it.

It seems I didn't delete it fast enough. I realized that after submitting.

Are you certain this is correct? It is from the comments countLeadingZeros(a ^ 3) >> 5 => a - 3 == 0

For the count leading zeros in combination with a right shift to be equivalent to a comparison there would need to be more checks involved.

astrelsky avatar Mar 12 '21 02:03 astrelsky

I'm not sure I understand you. If you're suggesting that the BADTYPE could come from just countLeadingZeros(a ^ 3) >> 5 without the rule, then I think that is possible, but it seems unlikely to me. I could test, but since I think I've fixed the BADTYPE issue by explicitly setting types, I'm not sure it's worthwhile.


If you're instead asking whether the rule countLeadingZeros(a ^ 3) >> 5 => a - 3 == 0 is correct, I can answer that. The key assumption is that it's running on a 32-bit system, in which case countLeadingZeros(x) can return between 0 and 32, inclusive. Only when x == 0 will there be 32 zeros, and since 32 = 25, only when x == 0 will countLeadingZeros(x) >> 5 be 1.

My actual implementation does not assume a 32-bit system but instead uses the possible-nonzero-bit mask, which I set here; the mask will be 63 for a 32-bit system, and 127 for a 64-bit system for instance. The rule only applies when the mask combined with the shift produces 1; since countLeadingZeros can only return between 0 and the number of bits, that should work.

I'm fairly certain that this is accurate for processors that use a power of 2 for the number of bits, but I guess it would fail for other processors (e.g. a 24-bit processor would get a mask of 31, which would require a shift of 4, but not all bits would need to be set to 0 in that case (only at least 16 bits)). Such processors wouldn't use countLeadingZeros for checking for if something is zero, but probably the rule needs to be disabled in that case.

Pokechu22 avatar Mar 12 '21 02:03 Pokechu22

Should it be countLeadingZeros(a ^ 3) >> 5 => a ^ 3 == 0 instead of countLeadingZeros(a ^ 3) >> 5 => a - 3 == 0?

astrelsky avatar Mar 12 '21 02:03 astrelsky

Oh. Yeah, that's a typo (but a ^ 3 == 0 is equivalent to a == 3 (RuleXorCollapse) and a - 3 == 0 is also equivalent to a == 3 (RuleEqual2Zero), so the statement is still correct (a ^ 3 == 0 if and only if a - 3 == 0, although in general a ^ 3 won't equal a - 3)).

Pokechu22 avatar Mar 12 '21 03:03 Pokechu22

I've been playing with the rule parser. I was trying to see if I could implement this using it but it is either too complex or the parser is too incomplete. Try using newUniqueOut with newOp instead of creating a new varnode.

There is also FuncData::opMarkCalculatedBool

astrelsky avatar Mar 12 '21 03:03 astrelsky

newUniqueOut(1, newOp) seems to work, though it results in the same casts to undefined4 that explicitly setting TYPE_BOOL produces, along with override signature detecting bool. I've changed to use that. Using newUniqueOut(oldOut->getSize(), newOp) still produces BADTYPE. As far as I can tell, opMarkCalculatedBool changes nothing.

Results in BADTYPE
      // Becomes a comparison with zero
      PcodeOp* newOp = data.newOp(2, baseOp->getAddr());
      data.opSetOpcode(newOp, CPUI_INT_EQUAL);
      Varnode* b = data.newConstant(outVn->getSize(), 0);
      data.opSetInput(newOp, op->getIn(0), 0);
      data.opSetInput(newOp, b, 1);
      data.opMarkCalculatedBool(newOp);

      Varnode* oldOut = baseOp->getOut();
      Varnode* newOut = data.newVarnodeOut(oldOut->getSize(), oldOut->getAddr(), newOp);
      data.opSetOutput(newOp, newOut);
      data.opInsertBefore(newOp, baseOp);
      data.totalReplace(oldOut, newOut);
      data.opDestroy(baseOp);
      return 1;
      // Becomes a comparison with zero
      PcodeOp* newOp = data.newOp(2, baseOp->getAddr());
      data.opSetOpcode(newOp, CPUI_INT_EQUAL);
      Varnode* b = data.newConstant(outVn->getSize(), 0);
      data.opSetInput(newOp, op->getIn(0), 0);
      data.opSetInput(newOp, b, 1);
      //data.opMarkCalculatedBool(newOp);

      Varnode* oldOut = baseOp->getOut();
      Varnode* newOut = data.newUniqueOut(oldOut->getSize(), newOp);
      data.opSetOutput(newOp, newOut);
      data.opInsertBefore(newOp, baseOp);
      data.totalReplace(oldOut, newOut);
      data.opDestroy(baseOp);
      return 1;
Results in casts to undefined4
      // Becomes a comparison with zero
      PcodeOp* newOp = data.newOp(2, baseOp->getAddr());
      data.opSetOpcode(newOp, CPUI_INT_EQUAL);
      Varnode* b = data.newConstant(outVn->getSize(), 0);
      data.opSetInput(newOp, op->getIn(0), 0);
      data.opSetInput(newOp, b, 1);
      data.opMarkCalculatedBool(newOp);

      Varnode* oldOut = baseOp->getOut();
      Datatype* ct = data.getArch()->types->getBase(1, TYPE_BOOL);
      Varnode* newOut = data.newVarnode(1, oldOut->getAddr(), ct);
      data.opSetOutput(newOp, newOut);
      data.opInsertBefore(newOp, baseOp);
      data.totalReplace(oldOut, newOut);
      data.opDestroy(baseOp);
      return 1;
      // Becomes a comparison with zero
      PcodeOp* newOp = data.newOp(2, baseOp->getAddr());
      data.opSetOpcode(newOp, CPUI_INT_EQUAL);
      Varnode* b = data.newConstant(outVn->getSize(), 0);
      data.opSetInput(newOp, op->getIn(0), 0);
      data.opSetInput(newOp, b, 1);
      data.opMarkCalculatedBool(newOp);

      Varnode* oldOut = baseOp->getOut();
      Varnode* newOut = data.newUniqueOut(1, newOp);
      data.opSetOutput(newOp, newOut);
      data.opInsertBefore(newOp, baseOp);
      data.totalReplace(oldOut, newOut);
      data.opDestroy(baseOp);
      return 1;
      // Becomes a comparison with zero
      PcodeOp* newOp = data.newOp(2, baseOp->getAddr());
      data.opSetOpcode(newOp, CPUI_INT_EQUAL);
      Varnode* b = data.newConstant(outVn->getSize(), 0);
      data.opSetInput(newOp, op->getIn(0), 0);
      data.opSetInput(newOp, b, 1);
      //data.opMarkCalculatedBool(newOp);

      Varnode* oldOut = baseOp->getOut();
      Varnode* newOut = data.newUniqueOut(1, newOp);
      data.opSetOutput(newOp, newOut);
      data.opInsertBefore(newOp, baseOp);
      data.totalReplace(oldOut, newOut);
      data.opDestroy(baseOp);
      return 1;

I also found that x86 has a LZCNT instruction, which was previously implemented with a loop; I've replaced it with countLeadingZeros.

Pokechu22 avatar Mar 14 '21 02:03 Pokechu22

I've had a look at the cause of the undefined4/BADTYPE cast issues, here are my findings:

  • It is correct to make the output Varnode of the CPUI_INT_EQUAL PCodeOp 1 byte long. In fact, the output has to have size 1 (as described here)
  • What is not correct/allowed however, and this is where the issue arises from, is that you pass the output of the CPUI_INT_EQUAL PCodeOp directly to the following operation (which will be an CPUI_INT_OR PCodeOp in the case of cntlzw followed by rlwinm, but could technically be anything). Both inputs and the outputs of that following instruction must have the same size, but this is not given here because the first input was changed to the 1-byte output of CPUI_INT_EQUAL,.

Therefore, a CPUI_INT_ZEXT PCodeOp is required before the value gets passed to the following operation. This both fixes the boolean type propagation issues as well as the SUB11(val, 0) issues you were having.

Here is how it could look like with these changes:

// Becomes a comparison with zero
PcodeOp* newOp = data.newOp(2, baseOp->getAddr());
data.opSetOpcode(newOp, CPUI_INT_EQUAL);
Varnode* b = data.newConstant(op->getIn(0)->getSize(), 0);
data.opSetInput(newOp, op->getIn(0), 0);
data.opSetInput(newOp, b, 1);

// CPUI_INT_EQUAL must produce a boolean result
Varnode* eqResVn = data.newUniqueOut(1, newOp);
data.opSetOutput(newOp, eqResVn);

data.opInsertBefore(newOp, baseOp);

// Because the old output had size op->getIn(0)->getSize(),
// we have to guarantee that a Varnode of this size gets outputted
// to the descending PcodeOps. This is handled here with CPUI_INT_ZEXT.
data.opRemoveInput(baseOp, 1);
data.opSetOpcode(baseOp, CPUI_INT_ZEXT);
data.opSetInput(baseOp, eqResVn, 0);
return 1;

And here the decompilations of the test case you posted (with this patch applied):

At -O0:

void bar(int param_1,int param_2)

{
  foo(param_1 == 3,param_2 == 2);
  flag1b = param_1 == 1;
  flag2b = param_2 == 2;
  flag3b = param_2 == 4 && param_1 == 3;
  flag1i = (uint)(param_1 == 5);
  flag2i = (uint)(param_2 == 6);
  flag3i = (uint)(param_2 == 8 && param_1 == 7);
  return;
}

At -O3:

void bar(int param_1,int param_2)

{
  foo(param_1 == 3,param_2 == 2);
  flag3i = (uint)(param_1 == 7 && param_2 == 8);
  flag2i = (uint)(param_2 == 6);
  flag1i = (uint)(param_1 == 5);
  flag3b = param_2 == 4 && param_1 == 3;
  flag2b = param_2 == 2;
  flag1b = param_1 == 1;
  return;
}

(Note that I didn't have to manually retype the variables, but there's still one tiny nitpick: for the flagxi variables, it casts to uint instead of int, I'm not quite sure why that happens)

RootCubed avatar Sep 03 '21 16:09 RootCubed

Thanks, I've applied that change (and regenerated the parser/grammar to fix the merge conflict).

(Note that I didn't have to manually retype the variables, but there's still one tiny nitpick: for the flagxi variables, it casts to uint instead of int, I'm not quite sure why that happens)

I think it does that to indicate that zero-extension is being used. If I use CPUI_INT_SEXT instead of CPUI_INT_ZEXT, then it'll generate int instead of uint. It's probably not worth worrying about, unless there's a way of not making the extension show up in the decompiler output at all.

Pokechu22 avatar Sep 04 '21 00:09 Pokechu22

is there any plan to merge this ? this pattern is pretty common on various arm32 binaries i saw and this patch just works.

alexmaloteaux avatar Feb 18 '22 23:02 alexmaloteaux

I actually made a PR for this two years ago: https://github.com/NationalSecurityAgency/ghidra/pull/2312 However I did not introduce a new PcodeOp, I just go off of the userop names for count leading zeros, and I did nothing with count leading ones.

ghost avatar Jul 06 '22 16:07 ghost

IMO introducing a new PcodeOp is cleaner than handling the userop names, since the userop names are inconsistent (ARM uses count_leading_zeroes while everything else uses countLeadingZeros). But, introducing a new PcodeOp is also why this PR is so big, as several large files had to be regenerated.

Pokechu22 avatar Jul 06 '22 17:07 Pokechu22

On further thought, a separate PcodeOp for countLeadingOnes is not needed, as countLeadingOnes(x) should be equivalent to countLeadingZeros(~x).

Pokechu22 avatar Jul 10 '22 21:07 Pokechu22

IMO introducing a new PcodeOp is cleaner than handling the userop names, since the userop names are inconsistent (ARM uses count_leading_zeroes while everything else uses countLeadingZeros). But, introducing a new PcodeOp is also why this PR is so big, as several large files had to be regenerated.

I agree that the userop name comparison is gross and not ideal. It does check for the ARM and PPC names btw. But also adding new opcodes for relatively rare operations seems like a slippery slope. Before long you have CPUI_BLSR, CPUI_PDEP,CPUI_PEXT, etc.

But I'd say in this case a much stronger argument exists for CLZ to be its own opcode than any of the ones I listed, and that theres a way better reason for them to exist than CPUI_EXTRACT and CPUI_INSERT, which really ought to just be shifts, masks, ors instead of introducing a totally new opcode that won't trigger existing optimization rules, or CPUI_POPCOUNT, which ought to instead be the normal horizontal sum sequence, and then optimization rules could be added in that detect that pattern. But for lzcnt that would be very messy, since branching is required.

Ideally, for these kinds of rare operations it would be best to have some kind of alternate enum instead of requiring a string comparison of the usercall arg.

ghost avatar Jul 11 '22 13:07 ghost

On further thought, a separate PcodeOp for countLeadingOnes is not needed, as countLeadingOnes(x) should be equivalent to countLeadingZeros(~x).

Probably best to do that then, ghidra likes opcode reuse, just look at the comparisons haha

ghost avatar Jul 11 '22 13:07 ghost

and that theres a way better reason for them to exist than CPUI_EXTRACT and CPUI_INSERT, which really ought to just be shifts, masks, ors instead of introducing a totally new opcode that won't trigger existing optimization rules

Hmm, that's actually interesting; sleigh_ref.html says that v0[6,1] is "(simulated)", and EXTRACT and INSERT don't appear on pcoderef.html, but only on additionalpcode.html which says:

The following opcodes are not generated as part of the raw translation of a machine instruction into p-code operations, so none of them can be used in a processor specification. But, they may be introduced at a later stage by various analysis algorithms.

pcoderef.html also mentions:

Generation of raw p-code is a necessary first step in graph construction, but additional steps are required, which introduces some new opcodes. Two of these, MULTIEQUAL and INDIRECT, are specific to the graph construction process, but other opcodes can be introduced during subsequent analysis and transformation of a graph and help hold recovered data-type relationships. All of the new opcodes are described in the section called “Additional P-CODE Operations”, none of which can occur in the original raw p-code translation.

So I think this is kinda already how it works, just in a somewhat strange and jank manner where they're both in the same opcode enum.

Pokechu22 avatar Jul 11 '22 20:07 Pokechu22

I've installed bison 3.0.4 from source (and am still using flex 2.6.4), which eliminates most of the noise from running gradle lexSleigh yaccDecompiler. Even after running it, I did need to reintroduce the header comment, and also I needed to run cd Ghidra/Features/Decompiler/src/decompile/cpp/ and flex -o slghscan.cc slghscan.l to avoid a large number of changes along the lines of #line 566 "slghscan.l" to #line 566 "src/decompile/cpp/slghscan.l".

I also split created a separate pull request (#4436) to make the documentation generation a bit more consistent. I've included the same commit here (which git rebase will automatically remove when the other PR is merged).

Pokechu22 avatar Jul 11 '22 23:07 Pokechu22

countLeadingOnes has been removed; this is ready for review again.

Pokechu22 avatar Jul 12 '22 01:07 Pokechu22

Just tried implementing the >>5 analysis class (seeing this a lot in PPC 32 BE is why) before it clicked I'd need the CLZ logic. Searched CLA and found this. Wondering if nsa should maybe do the generated file and leave out of the pull request (It'd look less scary anyhow) and if there's any minor changes needed in placement/defines it'd need to be redone anyways?

for the right shift, you have SRIGHT and RIGHT, have you seen SRIGHT, or just covering all cases if it were to be some instruction combo?

mumbel avatar Aug 17 '22 15:08 mumbel

ryanmkurtz any way to push this further ? the #2312 was closed , i guess in favor of this one, but there isnt any status so it could go unnoticed once more for a long time. Thanks

alexmaloteaux avatar Sep 14 '22 13:09 alexmaloteaux

Rebased to fix merge conflicts. 8d4a6c213ea252eec6dcb79079a6820a09418584 seems to have used a different version of flex/bison, but for now I re-used the ones I previously had installed (so there's a bit of noise from that).

Pokechu22 avatar Sep 14 '22 22:09 Pokechu22

just to confirm that it still works flawless for me on various arm32 binary compiled with linaro gcc

alexmaloteaux avatar Sep 15 '22 09:09 alexmaloteaux

Thanks, I will make those changes.

I didn't know about the Makefile; it would be nice to have that information documented somewhere. I also had to manually re-add the IP: GHIDRA comment when I regenerated the files with Gradle; I don't know whether the same would apply with the Makefile or not.

Pokechu22 avatar Mar 02 '23 21:03 Pokechu22

OK, I've determined that using the Makefile results in paths showing as grammar.y instead of src/decompile/cpp/grammar.y. So that explains why those were different.

Annoyingly, make doesn't want to build everything; even using make --always-make --keep-going fails with this output:

bison -p xml -o xml.cc xml.y
bison -p cparse -o grammar.cc grammar.y
bison -p pcode -o pcodeparse.cc pcodeparse.y
mkdir -p com_opt
mkdir -p com_dbg
make: 'grammar.cc' is up to date.

not generating slghparse.cc, slghscan.cc, or ruleparse.cc. I can manually run bison -d -o slghparse.cc slghparse.y, lex -oslghscan.cc slghscan.l, and bison -p ruleparse -d -o ruleparse.cc ruleparse.y, though.

I'll remove the changes to those generated files and let you regenerate them instead. This will probably result in a few commits that don't work properly without manually generating the files (affecting e.g. git bisect), though - I'm not sure whether that's a problem or not.

Pokechu22 avatar Mar 02 '23 22:03 Pokechu22