PandA-bambu
Paired memory requests with the same addresses
Does it make any sense that Bambu sends paired requests to memory with the same addresses?
It is also unclear for what reason the width of Mout_data_ram_size is 14 (not 12 = 2*6), while the width of M_Rdata_ram is 128 (2*64).
Is the start_port not connected, or is it just yellow-colored?
Anyway, it looks strange that Mout_adr_ram is holding a pair of zero addresses. Which object is the load working on? A pointer-based object passed to the design, or a variable allocated inside the design? Concerning Mout_data_ram_size: the data bus is 64 bits wide on each of the two parallel channels, so the largest size a load can put on a bus is 64. Encoding 64 in binary requires 7 bits, hence 2*7 = 14.
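To make the width arithmetic concrete, here is a minimal C sketch. It is not Bambu code; the helper name bits_for is made up for illustration, and it only assumes what is stated above: two channels, 64 data bits each, and a size field that must be able to hold the value 64.

```c
#include <assert.h>

/* Hypothetical helper: number of bits needed to represent
 * every integer in the range [0, max_value] in plain binary. */
static unsigned bits_for(unsigned max_value)
{
    unsigned bits = 0;
    while (max_value != 0) {
        ++bits;
        max_value >>= 1;
    }
    return bits;
}

/* Width of a signal that concatenates one field per channel. */
static unsigned bus_width(unsigned channels, unsigned field_bits)
{
    return channels * field_bits;
}
```

With these, bus_width(2, bits_for(64)) gives 14 for Mout_data_ram_size and bus_width(2, 64) gives 128 for M_Rdata_ram, while bits_for(63) gives the 6 bits the question expected.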
> Is the start_port not connected, or is it just yellow-colored?
Just yellow-colored.
> Anyway, it looks strange that Mout_adr_ram is holding a pair of zero addresses. Which object is the load working on?
The start address is zero (the first green line), so that is OK.
> A pointer-based object passed to the design, or a variable allocated inside the design?
A pointer-based object from outside.
> Encoding 64 in binary requires 7 bits.
OK, but it looks redundant.
A zero pointer may create some problems with the C code, so pay attention to the compiled code.
Anyway, you can see the end result of the optimizations by passing --print-dot and checking the HLS_output/dot/ directory.
If it helps, I can send the source code (this is student work: 8x8-block processing in JPEG).
As for the redundancy, it depends on the code: two loads performed in parallel may well read the same location.
Sure. I just meant that 6 bits are enough (size=0 is not used).
> If it helps, I can send the source code (this is student work: 8x8-block processing in JPEG).
Yes, it would help. Please also share the options passed to bambu.
> Sure. I just meant that 6 bits are enough (size=0 is not used).
When the simple interface was designed, we went for a one-hot encoding of the size.
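For reference, the 6-bit argument from the quoted remark can be sketched in C as follows. This is only an illustration of the idea under the assumption that sizes range over 1..64 and size 0 never occurs; it is not how Bambu's interface encodes the size, and the names encode_size and decode_size are hypothetical.

```c
#include <assert.h>

/* Hypothetical biased encoding (NOT Bambu's interface):
 * store size - 1, so sizes 1..64 map to codes 0..63 and fit in 6 bits. */
static unsigned encode_size(unsigned size_in_bits)
{
    assert(size_in_bits >= 1 && size_in_bits <= 64);
    return size_in_bits - 1;
}

/* Inverse mapping: recover the size from its 6-bit code. */
static unsigned decode_size(unsigned code)
{
    assert(code <= 63);
    return code + 1;
}
```

The trade-off is an extra decrement and increment on each side of the bus, whereas carrying the size value itself needs 7 bits per channel but no conversion.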
bambu --top-fname compress --device-name=xc7z020-1clg484-YOSYS-VVD \
--experimental-setup=BAMBU-PERFORMANCE-MP jpegcompress.c
Dear Alexander, I tried the example you uploaded with the latest version of bambu (which you can find in the panda-0.9.7-dev branch), and it seems to me that the generated design behaves correctly with respect to the C implementation. I only had some problems synthesizing at the standard clock period of 10 ns with YOSYS, because the BRAMs are too slow for that frequency, but setting a longer period leads to a fully functional design. I encourage you to try the latest bambu version if you are still interested. If you want to try different memory configurations, you can play with the predefined experimental setups or with the --channels-number and --channels-type options. You can find all the necessary information in the bambu help section, which you can print by passing --help to the bambu executable.
Dear Michele,
Thanks for your reply. I will try.