dtc icon indicating copy to clipboard operation
dtc copied to clipboard

Restore phandles from binary representations

Open ukleinek opened this issue 1 year ago • 21 comments

device trees generated with options -@ and -L contain information about the label names used originally and which values are phandle values. This information can be reused when compiling back to dts format to make the result much more human readable.

Also dtdiff is adapted to make use of this to generate smaller diffs without hunks e.g. for a different phandle allocation.

This superseeds pull request https://github.com/dgibson/dtc/pull/93.

ukleinek avatar Oct 21 '24 07:10 ukleinek

A possible followup improvement would be to restore phandles from __fixup__ .

ukleinek avatar Oct 21 '24 08:10 ukleinek

Just a quick headsup: I found a dtbo file where my code crashes with -@ -L. So don't merge yet please.

ukleinek avatar Oct 24 '24 06:10 ukleinek

OK, I think that's unrelated to my changes and just presents a situation to the decompiler that didn't happen before. Here is a reproducer:

$ git diff
diff --git a/treesource.c b/treesource.c
index 067647b60a17..fa1a48579489 100644
--- a/treesource.c
+++ b/treesource.c
@@ -104,6 +104,12 @@ static void write_propval_string(FILE *f, const char *s, size_t len)
 static void write_propval_int(FILE *f, const char *p, size_t len, size_t width)
 {
        const char *end = p + len;
+
+       if (len % width) {
+               fprintf(stderr, "Huh, len = %zu, width = %zu\n", len, width);
+               exit(1);
+       }
+
        assert(len % width == 0);
 
        for (; p < end; p += width) {

uwe@taurus:~/dts-decompile-labels$ cat test3.dts
/dts-v1/;

/ {
	tralala = <&somelabel>, "text making the next label unaligned",
		<&somelabel>, "some more text";

	somelabel: somenode {

	};
};
$ dtc -@ -L -I dts -O dtb test3.dts > test3.dtb
$ dtc -@ -L -I dtb -O dts test3.dtb 
/dts-v1/;

/ {
Huh, len = 37, width = 4
	tralala = <&somelabel 

So I think some heuristic considers the property "tralala" an array of ints but that doesn't handle that it ends at offset 41 which is unaligned.

ukleinek avatar Oct 24 '24 06:10 ukleinek

A possible followup improvement would be to restore phandles from __fixup__ .

I think you do need to implement this. For one thing processing but __local_fixups__ but not __fixups__ would just be surprising behaviour to a user, I think. Secondly, it could cause some bogus errors in some cases (see detailed comments).

dgibson avatar Oct 28 '24 09:10 dgibson

Do you have an idea for the unaligned phandle problem?

ukleinek avatar Oct 28 '24 09:10 ukleinek

Do you have an idea for the unaligned phandle problem?

Sorry, I'm not sure exactly what problem you mean.

dgibson avatar Oct 28 '24 09:10 dgibson

Do you have an idea for the unaligned phandle problem?

Sorry, I'm not sure exactly what problem you mean.

I mean the crash reported in https://github.com/dgibson/dtc/pull/151#issuecomment-2434383526 with the follow up in the next comment.

ukleinek avatar Oct 28 '24 14:10 ukleinek

Do you have an idea for the unaligned phandle problem?

Sorry, I'm not sure exactly what problem you mean.

I mean the crash reported in #151 (comment) with the follow up in the next comment.

Ah, right, sorry.

I think the basic problem is that previously the only case where we'd have phandle markers during decompile is with -I dts -O dts, in which case we'd also have type markers helping us figure out how to print each part of each property. The new code is putting in phandle markers, but without type markers. We should handle that, of course, but looks like there are some edge cases.

More specifically, it looks like write_propval() calls guess_value_type() to figure out a missing type only once for the whole property. Having a phandle marker does tell us that those specific 4 bytes should be in integer < ... > context. But we'll need to separately guess the type for each chunk between phandles. I suspect that might be irritatingly fiddly, but I'm afraid it's going to be a prerequisite for implementing this.

dgibson avatar Oct 30 '24 02:10 dgibson

This addresses the crash mentioned in https://github.com/dgibson/dtc/pull/151#issuecomment-2434383526 and a few suggestions by @dgibson (which I marked as resolved). Still missing is restoring of phandles from __fixups__.

ukleinek avatar Jan 12 '25 01:01 ukleinek

OK, this got bigger now than I anticipated and also contains a few fixes, but I'm happy that I did it now and I can get it out of my head and happily use it.

Quick demo:

$ cat test.dts
/dts-v1/;
/plugin/;

/ {
	tralal = <&somelabel>, "text making the",
		<&somelabel>, "some more text";

	tralala = <&somelabel>, "text making the next label unaligned",
		<&somelabel>, "some more text";

	somelabel: somenode {
		property = <&nonexisting>;
	};
};
$ dtc -@ test.dts > test.dtb

Decompiling this dtb with dtc 1.7.2 gives you:

$ /usr/bin/dtc test.dtb
/dts-v1/;

/ {
	tralal = [00 00 00 01 74 65 78 74 20 6d 61 6b 69 6e 67 20 74 68 65 00 00 00 00 01 73 6f 6d 65 20 6d 6f 72 65 20 74 65 78 74 00];
	tralala = <0x01 0x74657874 0x206d616b 0x696e6720 0x74686520 0x6e657874 0x206c6162 0x656c2075 0x6e616c69 0x676e6564 0x00 0x1736f6d 0x65206d6f 0x72652074 0x65787400>;

	somenode {
		property = <0xffffffff>;
		phandle = <0x01>;
	};

	__symbols__ {
		somelabel = "/somenode";
	};

	__fixups__ {
		nonexisting = "/somenode:property:0";
	};

	__local_fixups__ {
		tralal = <0x00 0x14>;
		tralala = <0x00 0x29>;
	};
};

With the changes from this PR applied it gets:

$ dtc test.dtb
/dts-v1/;
/plugin/;

/ {
	tralal = <&somelabel>, "text making the", <&somelabel>, "some more text";
	tralala = <&somelabel>, "text making the next label unaligned", <&somelabel>, "some more text";

	somelabel: somenode {
		property = <&nonexisting>;
		phandle = <0x01>;
	};

	__symbols__ {
		somelabel = "/somenode";
	};

	__fixups__ {
		nonexisting = "/somenode:property:0";
	};

	__local_fixups__ {
		tralal = <0x00 0x14>;
		tralala = <0x00 0x29>;
	};
};

and if test.dtb was compiled without -@ it uses &{/somenode} instead of &somelabel.

ukleinek avatar Jan 14 '25 11:01 ukleinek

BTW, I didn't implement stripping out the __fixups__, __local_fixups__ and __symbols__ nodes. Mostly because it doesn't seem to be right for -I dts -O dts compilations. Instead the two fixup nodes are removed before they are generated to make sure that the information isn't duplicated.

ukleinek avatar Jan 15 '25 09:01 ukleinek

I've applied the first three patches, because they're sensible fixes independent of the rest of the series. I've made some comments on the next few, but I haven't completed a full review yet.

The topmost commit is also simple and orthogonal to the rest of the series. That one might be eligible for fast tracking, too. I'll look into your other feedback later.

ukleinek avatar Jan 25 '25 15:01 ukleinek

The topmost commit is also simple and orthogonal to the rest of the series. That one might be eligible for fast tracking, too. I'll look into your other feedback later.

I split this into a separate PR, see #168.

ukleinek avatar Jul 01 '25 10:07 ukleinek

I think I addressed all feedback. The only open discussion is if __fixups__, __local_fixups__ and __symbols__ should be dropped and under which conditions.

ukleinek avatar Jul 01 '25 11:07 ukleinek

The first two patches looked fine and I was going to apply.. but the second one breaks a test case:

dtc -I dtb -O dts -o overlay_overlay_decompile.test.dts overlay_overlay.test.dtb:	PASS
dtc -I dts -O dtb -o overlay_overlay_decompile.test.dtb overlay_overlay_decompile.test.dts:PASS
dtbs_equal_ordered overlay_overlay.test.dtb overlay_overlay_decompile.test.dtb:	Starting testcase "./dtbs_equal_ordered", pid 98779
FAIL	Tag mismatch (1 != 2) at (672, 672)

It looks like the decompiled version is entirely missing the __fixups__ and __local_fixups__ which should be there. I'm guessing this eventually gets fixed by the later patches which will reconstruct the necessary source level references to remake the fixups. However, we don't like to break bisection like that, so that patch might have to be moved later to where it won't break things.

dgibson avatar Jul 02 '25 05:07 dgibson

The first two patches looked fine and I was going to apply.. but the second one breaks a test case:

dtc -I dtb -O dts -o overlay_overlay_decompile.test.dts overlay_overlay.test.dtb:	PASS
dtc -I dts -O dtb -o overlay_overlay_decompile.test.dtb overlay_overlay_decompile.test.dts:PASS
dtbs_equal_ordered overlay_overlay.test.dtb overlay_overlay_decompile.test.dtb:	Starting testcase "./dtbs_equal_ordered", pid 98779
FAIL	Tag mismatch (1 != 2) at (672, 672)

It looks like the decompiled version is entirely missing the __fixups__ and __local_fixups__ which should be there. I'm guessing this eventually gets fixed by the later patches which will reconstruct the necessary source level references to remake the fixups. However, we don't like to break bisection like that, so that patch might have to be moved later to where it won't break things.

This commit uncovers a problem that we already have since commit 915daadbb62d ("Start with empty local_fixups and fixups nodes"):

$ cat test.dts 
/dts-v1/;
/plugin/;

/ {
	node {
		property = <0xffffffff>, "string";
	};
	
	__fixups__ {
		somenode = "/node:property:0";
	};
};

$ dtc -I dts -O dts test.dts 
/dts-v1/;

/ {

	node {
		property = <0xffffffff>, "string";
	};
};

I'll think about a fix.

ukleinek avatar Jul 02 '25 09:07 ukleinek

I'll think about a fix.

The fix is to fold the info contained in __fixes__ (and __local_fixups__) into markers, so the commits in this PR fix and don't only cover the issue again.

ukleinek avatar Jul 02 '25 09:07 ukleinek

I'll think about a fix.

The fix is to fold the info contained in __fixes__ (and __local_fixups__) into markers, so the commits in this PR fix and don't only cover the issue again.

I fixed that by sorting the respective commits before this change and expanded their commit logs to describe this detail.

ukleinek avatar Jul 02 '25 09:07 ukleinek

There is one test failure when the top commit is missing because restoring phandles from __fixups__ of an overlay results in a dangling phandle which isn't compilable without /plugin/'. But it's only the top commit that adds /plugin/. Reordering doesn't help, because then again we suffer from the test failure that we already saw before when you tried to apply the "Set DTSF_PLUGIN if needed when compiling from dtb" without restoring data from __fixups__ and __local_fixups__.

ukleinek avatar Jul 04 '25 08:07 ukleinek

This commit uncovers a problem that we already have since commit 915daad ("Start with empty local_fixups and fixups nodes"):

$ cat test.dts 
/dts-v1/;
/plugin/;

/ {
	node {
		property = <0xffffffff>, "string";
	};
	
	__fixups__ {
		somenode = "/node:property:0";
	};
};

$ dtc -I dts -O dts test.dts 
/dts-v1/;

/ {

	node {
		property = <0xffffffff>, "string";
	};
};

I'll think about a fix.

Ouch, ok. I missed that nasty side effect of 915daad. Unfortunately I don't think the fix proposed in the latest version is quite sufficient.

It's an important principle in dtc's design that it should generally be safe and lossless to convert dtb -> dts -> dtb. Likewise it should be "lossless" to convert dts -> dts, in the sense that going directly dts -> dtb and going dts -> dts -> dtb should result in the same output. This is exactly what those test cases are designed for.

This is supposed to be true even if the input tree contains garbage. It needs the basic structure to be parseable, of course, but if the properties contain things that are semantically malformed, they should be passed through unchanged. 915daad broke that.

Your latest series partially fixes this, but not completely: it will still simply delete entries it doesn't understand in the input tree. It might also re-order entries that it does understand. I can see two ways to fully fix this:

  1. Change fixup generation, so that instead of unconditionally adding the new fixup, it first checks to see if a matching fixup is already in __fixups__ or __local_fixups__. With that in place, 915daad can be reverted. That fix could be applied immediately, then your code to understand the existing fixups could be done on top of that. Alas, that is kind of messy and a fair bit of work.

  2. This approach might be a bit easier... but I'm not sure it gets every case correct. Instead of wiping __fixups__ clean as a single step, you remove entries one at a time as you parse them into fixups in the live tree (so you now know they'll be re-generated). This could still result in re-orderings which isn't great, but maybe acceptable. I'm also not sure if it handles all breakages from 915daad that don't involve a dtb -> dts stage.

dgibson avatar Jul 15 '25 02:07 dgibson

FTR: I took this patch set to the mailing list as I prefer this workflow after this PR got unwieldy. v1: https://lore.kernel.org/devicetree-compiler/[email protected] v2: https://lore.kernel.org/devicetree-compiler/[email protected]

ukleinek avatar Sep 19 '25 10:09 ukleinek