Unexpected behaviour of printf

Open JamesTimothyMeech opened this issue 1 year ago • 9 comments

I generated a litex SoC and deployed it to a Digilent Arty using this command:

python3 -m litex_boards.targets.digilent_arty --bios-format float  --cpu-type femtorv --cpu-variant gracilis --variant a7-100 --toolchain vivado --with-spi-sdcard --sdcard-adapter digilent --timer-uptime --build --load

I then used LiteOs to compile a simple C program to print a float: https://github.com/BrunoLevy/learn-fpga/tree/master/LiteX/software/LiteOS

The program successfully compiles and runs but it prints the incorrect value:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv) {
   float f = 1.5;
   printf("%f\n",f);
   return 0;
}

liteOS> run float.elf
8589869000.000000

Does anyone have any advice about the best way to debug this?

JamesTimothyMeech avatar Jan 27 '24 19:01 JamesTimothyMeech

Hi James,

Welcome to the wonderful world of custom CPUs and FPGAs. As you work through bugs like this, you'll learn a LOT about how programs work.

In this particular case, I'd try several things:

  • a different variable type (e.g. int or char)
  • printing the float as a hex int to see if the bit pattern is right or wrong
  • a different cpu-variant. If this "fixes" the output then the problem is either in the CPU (very unlikely) or in the per-cpu-type software libraries that support the CPU.

Also, your program doesn't have any BSS storage. Try defining

  • a global static BSS variable (i.e. one with no explicit initialisation), such as int bss_var;, or
  • a global static data variable, such as int data_var = 5;, or
  • both

I've seen cases where crt0.s has a bug if one of BSS or Data is empty.

alanvgreen avatar Jan 27 '24 19:01 alanvgreen

I should add - using https://www.h-schmidt.net/FloatConverter/IEEE754.html you can see that the hex representation of 8589869000.000000 is 0x4fffff80, while 1.5 is 0x3fc00000.

alanvgreen avatar Jan 27 '24 20:01 alanvgreen

Thanks for the tips! I'll document my attempts to debug using your tips here. I tried using a different CPU and recompiled LiteOS and my program:

python3 -m litex_boards.targets.digilent_arty --bios-format float --cpu-type vexriscv --variant a7-100 --toolchain vivado --with-spi-sdcard --sdcard-adapter digilent --timer-uptime --build --load

I got the same result:

 liteOS> run float.elf
8589869000.000000

I'll try the other suggestions now!

JamesTimothyMeech avatar Jan 27 '24 20:01 JamesTimothyMeech

Adding the global variables you mentioned doesn't seem to help, but maybe my crt0.S is missing something important! Adding int bss_var;

#include <stdio.h>
#include <stdlib.h>

int bss_var;

int main(int argc, char** argv) {
   float f = 1.5;
   printf("%f\n",f);
   return 0;
}

Adding int data_var = 5;

#include <stdio.h>
#include <stdlib.h>

int data_var = 5;

int main(int argc, char** argv) {
   float f = 1.5;
   printf("%f\n",f);
   return 0;
}

Adding both

#include <stdio.h>
#include <stdlib.h>

int bss_var;
int data_var = 5;

int main(int argc, char** argv) {
   float f = 1.5;
   printf("%f\n",f);
   return 0;
}

liteOS> run float_bss.elf
8589869000.000000
liteOS> run float_data.elf
8589869000.000000
liteOS> run float_bss_data.elf
8589869000.000000

JamesTimothyMeech avatar Jan 27 '24 21:01 JamesTimothyMeech

> I should add - using https://www.h-schmidt.net/FloatConverter/IEEE754.html you can see that the hex representation of 8589869000.000000 is 0x4fffff80, while 1.5 is 0x3fc00000.

When I run this program:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv) {
   float f = 1.5;
   // Reinterpret the float's bits as a 32-bit integer
   union {
      float f;
      uint32_t u;
   } f2u = { .f = f };

   //printf ("float : %f\n", f);
   printf ("hex : 0x%08lx\n", (unsigned long)f2u.u);
   return 0;
}

I get this result which is correct:

liteOS> run float_hex.elf
hex : 0x3fc00000

I'll dig into the crt0.S tomorrow to see if anything important is missing:

// crt0.S for executables
// interrupts and stack are already configured by OS
// _start does the following tasks:
//  1) save registers (ra, t0..t6, a0..a7)
//  2) initialize BSS
//  3) call main
//  4) restore registers
//  5) return to caller (LiteOS shell)

        .global _start
_start:
        // save context
	addi sp, sp, -16*4
	sw ra,  0*4(sp)
	sw t0,  1*4(sp)
	sw t1,  2*4(sp)
	sw t2,  3*4(sp)
	sw a0,  4*4(sp)
	sw a1,  5*4(sp)
	sw a2,  6*4(sp)
	sw a3,  7*4(sp)
	sw a4,  8*4(sp)
	sw a5,  9*4(sp)
	sw a6, 10*4(sp)
	sw a7, 11*4(sp)
	sw t3, 12*4(sp)
	sw t4, 13*4(sp)
	sw t5, 14*4(sp)
	sw t6, 15*4(sp)
	

	// initialize .bss
	la t0, _fbss
	la t1, _ebss
1:	beq t0, t1, 3f
	sw zero, 0(t0)
	addi t0, t0, 4
	j 1b
3:

        call main
	
	// restore context
	lw ra,  0*4(sp)
	lw t0,  1*4(sp)
	lw t1,  2*4(sp)
	lw t2,  3*4(sp)
	lw a0,  4*4(sp)
	lw a1,  5*4(sp)
	lw a2,  6*4(sp)
	lw a3,  7*4(sp)
	lw a4,  8*4(sp)
	lw a5,  9*4(sp)
	lw a6, 10*4(sp)
	lw a7, 11*4(sp)
	lw t3, 12*4(sp)
	lw t4, 13*4(sp)
	lw t5, 14*4(sp)
	lw t6, 15*4(sp)
	addi sp, sp, 16*4
	
	ret

JamesTimothyMeech avatar Jan 27 '24 21:01 JamesTimothyMeech

The issue is most likely in printf(), particularly if it has some conditional compilation features, but the other vague possibility is the float to double conversion when calling printf.

AndrewD avatar Jan 27 '24 22:01 AndrewD

I just had a brief look at picolibc. Maybe, as an experiment, try --bios-format double.

AndrewD avatar Jan 27 '24 22:01 AndrewD

Ah thanks I should have known to try that! Running:

python3 -m litex_boards.targets.digilent_arty --bios-format double --cpu-type femtorv --cpu-variant gracilis --variant a7-100 --toolchain vivado --with-spi-sdcard --sdcard-adapter digilent --timer-uptime --build --load

and recompiling LiteOS and my program produced the expected result:

liteOS> run float.elf
1.500000

JamesTimothyMeech avatar Jan 27 '24 22:01 JamesTimothyMeech

For the float option you probably need a gcc flag too. --float-double or something like that

AndrewD avatar Jan 27 '24 22:01 AndrewD