proxmark3

More ARM -> Thumb ?

Open doegox opened this issue 5 years ago • 25 comments

doegox avatar Jun 26 '19 12:06 doegox

The Thumb ISA is more limited. Instructions are 2x smaller, but you sometimes need more of them, so, rule of thumb 🤭:

Thumb = more compact code, expect a gain of around 30%

ARM = faster code

=> keep ARM for speed-critical or time-critical code

doegox avatar Jun 26 '19 12:06 doegox

thumb for saving space, ok.

iceman1001 avatar Jun 26 '19 14:06 iceman1001

BTW slurdge is on it. He did a first test moving everything to Thumb and gained something like 5 to 10%. Not much, but this needs more testing to make sure nothing breaks and to see whether there is better tuning to do.

doegox avatar Jul 04 '19 11:07 doegox

5-10%, that's 12.5kb to 25kb. Not too shabby. I do have doubts about some attack paths with time-critical components, though, if Thumb is slower.

iceman1001 avatar Jul 04 '19 12:07 iceman1001

Thumb is not necessarily slower if the instructions fit nicely in the Thumb encoding. It may even be faster (because the decode step is faster). The thing to watch for is the size of the generated code, which can be larger in specific sections of functions. With Thumb I go down to ~81% on a 256k board, and basic testing doesn't show any differences. Of course it would be nice to have some benchmarks :)

slurdge avatar Jul 05 '19 16:07 slurdge

Yes, we have two issues.

  1. Flashing above the 256kb limit doesn't work. The current flasher bricks the device, and you need JTAG to recover.
  2. If we don't lose speed and get smaller, that is good for all 256kb devices, i.e. non-RDV4.

btw, I am very happy to see you involved @slurdge !

iceman1001 avatar Jul 05 '19 17:07 iceman1001

@slurdge Did you run any benchmarks?

We would need to make a decision about this one. Either thumb or keep as it is.

iceman1001 avatar Jul 09 '19 12:07 iceman1001

I just tried regular commands and they seem to work. I would be happy to try a benchmark, but I'm not aware of any time measurement methods. My intuition would be to move almost everything (i.e. groups) to Thumb.

slurdge avatar Jul 09 '19 12:07 slurdge

We could move all the LF stuff to Thumb. Not much high-tech stuff going on there... well, besides the hitag2/s code.

iceman1001 avatar Jul 09 '19 12:07 iceman1001

Just to add my thought: we're talking about Thumb, but we don't actually use pure Thumb, since we use thumb-interwork to make it work alongside ARM. How much might this break our assumptions here?

cjbrigato avatar Aug 04 '19 16:08 cjbrigato

BTW we could try to enforce Thumb as much as possible and rely on the NetBSD way of finding Thumb incompatibilities:

In a large codebase like NetBSD it becomes difficult to manually check if any one object file can be compiled to thumb mode. Luckily brute force works with the help of make option -k, as in keep going even one object file does not compile. By compiling whole tree with CPUFLAGS=-mthumb and MAKEFLAGS=-k, all of the build time failing machine dependent object files can be found, and marked with the help of Per file build options override to be compiled to ARM mode with thumb interworking.
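The same brute-force sweep could be scripted on the host. Below is a minimal Python sketch, assuming an `arm-none-eabi-gcc` toolchain on the PATH. The helper names `compiles_as_thumb` and `split_by_mode`, the exact flag set, and the file names in the demo are illustrative assumptions, not part of the proxmark3 or NetBSD build systems. A real build would also need the project's include paths and defines.

```python
import os
import subprocess
import tempfile


def compiles_as_thumb(source, cc="arm-none-eabi-gcc"):
    """Return True if `source` compiles with -mthumb.

    Assumption: this flag set is only a plausible one for an ARM7TDMI
    target; a real project needs its own -I/-D options as well.
    """
    with tempfile.TemporaryDirectory() as tmp:
        obj = os.path.join(tmp, "probe.o")
        cmd = [cc, "-mcpu=arm7tdmi", "-mthumb", "-mthumb-interwork",
               "-c", source, "-o", obj]
        return subprocess.run(cmd, capture_output=True).returncode == 0


def split_by_mode(sources, probe=compiles_as_thumb):
    """Keep going past failures (like `make -k`) and sort the sources.

    Files that pass the probe go to the Thumb list; files that fail are
    the ones to keep in ARM mode with interworking.
    """
    thumb, arm = [], []
    for src in sources:
        (thumb if probe(src) else arm).append(src)
    return thumb, arm


if __name__ == "__main__":
    # Demo with a fake probe so it runs without a cross toolchain.
    # The file names are hypothetical.
    fake_probe = lambda src: src != "asm_heavy.c"
    thumb, arm = split_by_mode(["foo.c", "asm_heavy.c", "bar.c"], fake_probe)
    print("Thumb candidates:", thumb)
    print("Keep in ARM mode:", arm)
```

Passing the probe as a parameter keeps the sorting logic testable without a compiler. Note that this only catches build-time failures, as in the NetBSD description; timing regressions still need a benchmark.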

cjbrigato avatar Aug 04 '19 17:08 cjbrigato

@slurdge can you provide a case where Thumb is actually faster than ARM? In every benchmark I've tried, Thumb-2 delivers 85% (worst case) to 125% (the famous faster-than-ARM cases) of ARM performance, for an average of 95%. But whatever the case, Thumb (not Thumb-2) never did better than 83% of ARM performance, with the worst case being 70% and the average score (11 EEMBC benchmarks) being 72% of ARM performance.

cjbrigato avatar Aug 04 '19 18:08 cjbrigato

Now that we have 512kb to play with, the size isn't super important.

iceman1001 avatar Aug 04 '19 18:08 iceman1001

not everybody has 512, and RRG is open to everybody ;)

doegox avatar Aug 04 '19 18:08 doegox

Doesn't mean we have to have a working 256kb fullimage.... although that would be a nice thing to offer.

iceman1001 avatar Aug 04 '19 18:08 iceman1001

@cjbrigato This is from memory, from when I was working on a similar processor inside the Nintendo DS. On my personal repo, I moved almost everything to Thumb and nothing broke (in my rather limited test cases). And I would very much like to push for a 256K basic image :-) We can use the 512K for more advanced cases.

slurdge avatar Aug 04 '19 19:08 slurdge

Exactly. Let's try staying <256k for PLATFORM!=PM3RDV4 (so without flash, spiffs, smartcard and usart)

doegox avatar Aug 04 '19 19:08 doegox

@slurdge I'm quite sure those processors came with Thumb-2, as you're referring to post-2003 architectures. But here we are on ARM7TDMI, and Thumb is absolutely not Thumb-2, unfortunately. Thumb is what we get when we -mthumb everything.

I found some slides with the benchmark I was sure I remembered correctly: https://elinux.org/images/8/8a/Experiment_with_Linux_and_ARM_Thumb-2_ISA.pdf Check slide no. 14, "Thumb-2 Performance", for the performance comparison with original Thumb.

So I think benchmarks need to be done here.

About Thumb compatibility: absolutely everything compiles and runs in full -mthumb without the interwork. In that case, a HF_COLIN + BT_ADDON RDV4 fullimage is reduced to 240k, down from 278k.

In comparison, a full-ARM -mno-thumb-interwork fullimage is 319k. At this stage, we are not able to run such an image. We would still need the interwork code, which drastically reduces the interest. If we are to permit such an image for some reason, then the whole firmware has to be ARM, including the bootrom and any ASM part which would have been made Thumb-only.
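To put the three image sizes side by side, here is a small Python sketch of the arithmetic. The sizes are the ones quoted above. The labels and the `percent_change` helper are illustrative, and treating the 278k image as the baseline is an assumption.

```python
def percent_change(new_kib, base_kib):
    """Relative size change of `new_kib` versus `base_kib`, in percent."""
    return 100.0 * (new_kib - base_kib) / base_kib


# Sizes in k as quoted in the thread; the labels are illustrative.
sizes = {
    "baseline (with interwork)": 278,
    "full -mthumb": 240,
    "full ARM, -mno-thumb-interwork": 319,
}

base = sizes["baseline (with interwork)"]
for label, size in sizes.items():
    print(f"{label}: {size}k ({percent_change(size, base):+.1f}% vs baseline)")
```

With these numbers, the full Thumb image is about 13.7% smaller than the baseline, and the full ARM image is about 14.7% larger.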

cjbrigato avatar Aug 04 '19 21:08 cjbrigato

Confirmed :

 arm-none-eabi-readelf -a obj/fullimage.stage1.elf|grep Thumb
  Tag_THUMB_ISA_use: Thumb-1

So let's keep in mind that we are talking about Thumb-1 here, so nothing close to what @slurdge was talking about.
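If this check needs to be repeated across builds, the attribute could be pulled out of the readelf output with a few lines of Python. This is only a sketch: `thumb_isa_use` is a hypothetical helper, and the sample text is just the single line shown above, not a full readelf dump.

```python
import re


def thumb_isa_use(readelf_output):
    """Return the value of Tag_THUMB_ISA_use, or None if the tag is absent."""
    match = re.search(r"^\s*Tag_THUMB_ISA_use:\s*(\S+)",
                      readelf_output, re.MULTILINE)
    return match.group(1) if match else None


# The one attribute line quoted in the thread.
sample = "  Tag_THUMB_ISA_use: Thumb-1\n"
print(thumb_isa_use(sample))
```

In practice the input would come from running the readelf command above and capturing its standard output.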

cjbrigato avatar Aug 04 '19 22:08 cjbrigato

Brace yourself, I'm ready to flash a bootrom in ARM without thumb-interwork (and as Thumb-enabled ARM CPUs do indeed boot in ARM mode, this implies I will be able to actually benchmark a true full Thumb vs full ARM mode).

I can smell the bricking around.

cjbrigato avatar Aug 04 '19 22:08 cjbrigato

It did not break. But not having start.c in Thumb mode breaks the jump. I spotted where the bootrom enforces the jump:

        __asm("bx %0\n" : : "r"(((int)&_osimage_entry) | 0x1));

So I guess I have another round of bootrom flashing to do :'(
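For context on the `| 0x1` in that line: the BX instruction uses bit 0 of the target register to select the instruction set, with 1 switching to Thumb state and 0 to ARM state. The Python sketch below only models that address arithmetic on the host. The helper names and the example address are made up for illustration.

```python
THUMB_BIT = 0x1


def bx_target(address, thumb):
    """Value to load into the register used by BX.

    Bit 0 set means the core enters Thumb state at the branch target;
    bit 0 clear means it enters ARM state.
    """
    aligned = address & ~THUMB_BIT
    return aligned | THUMB_BIT if thumb else aligned


def bx_mode(target):
    """Instruction set selected by a BX to `target`."""
    return "thumb" if target & THUMB_BIT else "arm"


# Hypothetical entry address, not the real _osimage_entry value.
entry = 0x00102000
print(hex(bx_target(entry, thumb=True)), bx_mode(bx_target(entry, thumb=True)))
print(hex(bx_target(entry, thumb=False)), bx_mode(bx_target(entry, thumb=False)))
```

This is why an entry point compiled as ARM code cannot be reached through a jump that always sets bit 0: the core would try to decode ARM instructions as Thumb.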

cjbrigato avatar Aug 04 '19 23:08 cjbrigato

It works. Will now bench everything like my life depends on answering the ARM vs Thumb-1 question.

cjbrigato avatar Aug 04 '19 23:08 cjbrigato

:-) I wasn't as extreme as you, just moved stuff from the ARMSRC to THUMBSRC. But I'm glad it works!

slurdge avatar Aug 05 '19 07:08 slurdge

This took a turn... almost 10 months later we pushed some of these suggestions.

iceman1001 avatar Jun 07 '20 16:06 iceman1001

So we moved a lot... it's down to @cjbrigato to state what we need to do ;)

iceman1001 avatar Jun 25 '20 22:06 iceman1001