rv32emu

Optimize cmp function

Open huaxinliao opened this issue 9 months ago • 8 comments

This change reduces both the number of instructions and the number of memory accesses in the cmp function.

huaxinliao avatar May 17 '24 06:05 huaxinliao

I analyzed the cmp function by compiling it to x86 instructions; see https://hackmd.io/ucXfO5ixSTePjNX_KFR_Vg?view#Optimize-cmp-function-441.

huaxinliao avatar May 17 '24 06:05 huaxinliao

I would be surprised if the compiler does not optimize this.

visitorckw avatar May 17 '24 06:05 visitorckw

I would be surprised if the compiler does not optimize this.

You can conduct experiments to verify it.

huaxinliao avatar May 17 '24 06:05 huaxinliao

Out of curiosity, I ran objdump -D on my Ubuntu x86-64 machine to examine the object code generated by gcc with ENABLE_GDBSTUB=1 and ENABLE_LTO=0. From what I can see, there is no difference in the generated code.

visitorckw avatar May 17 '24 06:05 visitorckw

I believe the difference you observed arises because the -O2 optimization was not enabled.

visitorckw avatar May 17 '24 06:05 visitorckw

How do I enable the -O2 flag, @visitorckw?

huaxinliao avatar May 17 '24 06:05 huaxinliao

How do I enable the -O2 flag, @visitorckw?

You can check the CFLAGS used for compiling rv32emu in the Makefile. If you are compiling a single C file with gcc, you could add the -O2 option.
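
As a minimal sketch of the single-file experiment: the file name cmp_demo.c and both functions below are hypothetical stand-ins for illustration, not the actual cmp function from rv32emu. Compiling the file with and without -O2 and disassembling the result lets you compare what gcc emits for a branchy and a branchless three-way comparison.

```c
/*
 * cmp_demo.c (hypothetical file, NOT the rv32emu cmp function).
 *
 * Build and inspect, once without and once with optimization:
 *   gcc -c cmp_demo.c -o cmp_demo_O0.o
 *   gcc -O2 -c cmp_demo.c -o cmp_demo_O2.o
 *   objdump -d cmp_demo_O0.o
 *   objdump -d cmp_demo_O2.o
 */
#include <stdint.h>

/* Three-way comparison written with explicit branches. */
int cmp_branchy(uint32_t a, uint32_t b)
{
    if (a < b)
        return -1;
    if (a > b)
        return 1;
    return 0;
}

/* The same comparison written without branches in the source. */
int cmp_branchless(uint32_t a, uint32_t b)
{
    /* Each relational expression evaluates to 0 or 1 in C. */
    return (a > b) - (a < b);
}
```

Both functions return the same value for every input, so any difference is only in the generated instructions. Whether the two disassemblies match depends on the compiler version and flags, so it is worth checking on your own machine.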

visitorckw avatar May 17 '24 06:05 visitorckw

How do I enable the -O2 flag, @visitorckw?

You can check the CFLAGS used for compiling rv32emu in the Makefile. If you are compiling a single C file with gcc, you could add the -O2 option.

With the -O2 flag added, the generated code is the same for both versions. Thank you, @visitorckw.

huaxinliao avatar May 17 '24 07:05 huaxinliao

I am inclined to close this pull request, since a modern compiler's optimizer can already generate good code here. There is no need to apply the proposed change.

jserv avatar May 17 '24 13:05 jserv