ghidra icon indicating copy to clipboard operation
ghidra copied to clipboard

Global assignments treated very differently in 10.3

Open mumbel opened this issue 2 years ago • 4 comments

Describe the bug 10.3 introduced a regression in assigning global pointers (possibly other global issues). Before when a function returned a pointer and stored into a global, there would be a DAT_AABBCCDD symbol, and probably undefined4, but changing the type would usually infer the correct pointer type. In 10.3 it is now _DAT_AABBCCDD with this warning in the function comments, and undefined, without ability to refer type

/* WARNING: Globals starting with '_' overlap smaller symbols at the same address */

PowerPC:BE:32:QUICC (1.5) A quick test, which I might have messed up, is this possibly just BE, or possibly only 32-bit?

Bisected down to [02248d2251ec4da62c1d8fbcda1503c89dd7af9e] GP-3077 Added constant tracking through stack for stack parameters, fixed issues with values getting crossed moving in and out of memory, added prototype param type creation, added setting for restricting parameters to know pointers to handle harvard architectures and pointertypedefs

Environment (please complete the following information):

  • OS: 22.04
  • Java Version: 17.0.7
  • Ghidra Version: 02248d2251ec4da62c1d8fbcda1503c89dd7af9e
  • Ghidra Origin: eclipse

mumbel avatar Jun 06 '23 21:06 mumbel

If it is endianess related, this function certainly looks related and suspect.

	private Object getPointerDataTypeValue(DataType dataType, Address lastSetAddr,
			BigInteger bval) {

		int len = dataType.getLength();
		byte[] byteArray = new byte[len];

		//< why is this assumed big endian all the time?
		//< Would make me think BigEndianDataConverter is not doing the correct thing since this code is not just BE
		BigEndianDataConverter.INSTANCE.putBigInteger(byteArray, 0, len, bval);

		//< why is this assumed big endian all the time? (true)
		MemBuffer buf =
			new ByteMemBufferImpl(program.getMemory(), lastSetAddr, byteArray, true);

		// if not enough bytes for data type, can't do it
		if (len > byteArray.length) {
			return null; //< is this some java thing, it was created with length len, how would this hit?
		}

		Object value = dataType.getValue(buf, dataType.getDefaultSettings(), len);

		return value;
	}	

mumbel avatar Jun 07 '23 19:06 mumbel

If it was an endianess problem I would think the pointer would come back reversed. The code above is odd, but the converter changes the bytes to big endian from either LE/BE. Then the data type converter gets the value for the bytes, it wasn't my first implementation choice. It is because the value that is passed to a function may need to be transformed into a pointer by the settings on a pointeTypedef (shift, offset, address space, etc...). The plan is to put something on the PointerTypedef that would do the above on a value, without having to interpret bytes from a memory.

Do you have some example bytes?

From your description, I don't think the constant analysis knows the size of the pointer, just that it is a pointer. The change is probably that we used to lay down a full undefined4 at the other end of pointer, the decompiler is complaining that an undefined is placed there which is 1 byte not what it expects.

Although the decompiler seems to think it has a size, but it may be inferring that from your return data type size.

In the global at DAT_AABBCCDD, is a value actually stored there with a size, or is it just a pointer value returned?

It may be there is a bug when handling the store to the location.

Do you have some sample bytes for a function you could share?

emteere avatar Jun 07 '23 21:06 emteere

@emteere I was leaning towards endianess just because, in my test case at least I couldn't repro with x86_64 or riscv64 (I'm not setup to compile for 32-LE, or just doing the flags wrong).

This should be all thats needed to show what I'm seeing in my larger program.

struct foo {
    int w;
    int x;
    int y;
    int z;
};

struct foo *g_thing;

struct foo g_mem;

struct foo *get_foo(void)
{
    return &g_mem;
}

int main(int argc, char **argv)
{
    g_thing = get_foo();
    return 0;

}
powerpc-linux-gnu-gcc test.c 
powerpc-linux-gnu-strip a.out

a.zip

a.zip has a.out, PowerPC:BE:32:QUICC (1.5), function 100004e4

mumbel avatar Jun 07 '23 22:06 mumbel

I think it is the change from always putting down undefined4 (or the size of a pointer) for unknown access size to just undefined.

I'll take a look, thanks for the example.

emteere avatar Jun 07 '23 22:06 emteere

any news on this regression?

mumbel avatar Sep 12 '23 03:09 mumbel

Dug into it a little more, and it was definitely a regression. It had the size of the data type being written, but didn't create the undefined

emteere avatar Sep 15 '23 18:09 emteere

There a fix on the way into the main branch.

emteere avatar Sep 15 '23 19:09 emteere

Awesome, thanks for talking a look/fixing

mumbel avatar Sep 15 '23 20:09 mumbel