amaranth icon indicating copy to clipboard operation
amaranth copied to clipboard

Added gatemate vendor and Updated init file

Open tarik-hamedovic opened this issue 1 year ago • 16 comments

Hello,

Following our discussion on the amaranth-lang/amaranth repository, we reached out to CologneChip and requested a modification to their constraint files. Specifically, they made it possible to Net instead of Pin_in, Pin_out, and Pin_inout, enabling the direction of the pin to be determined from the RTL file rather than explicitly declared in the constraint file.

CologneChip implemented this change, and I tested it on the Olimex GateMateA1-EVB Board using a simple Blinky example. The new setup works fine now.

I added a _gatemate.py file to the amaranth/vendor directory. This file leverages yosys and the p_r tool, which can be downloaded as a pre-built binary from CologneChip's official site.

For programming, I used openFPGALoader. I included this part in the amaranth-boards definition for the board, as I noticed that most other boards follow this approach.

I just wanted to ask aswell, maybe I can open up a issue aswell. The generated Verilog file is pretty messy. I have a ULX3S board and the generated Verilog file for the same Blinky example generates much cleaner code. What can be the issue here? And can it be fixed?

tarik-hamedovic avatar Jul 23 '24 17:07 tarik-hamedovic

The generated Verilog file is pretty messy. I have a ULX3S board and the generated Verilog file for the same Blinky example generates much cleaner code. What can be the issue here? And can it be fixed?

What is "messy"?

In general, the generated Verilog is intended to be human-readable (since you'll eventually have to read it, even if it is very rare with Amaranth in practice) but not beautiful (since you aren't going to be reading it the majority of the time). But I can point you to the difference between ULX3S and GateMate platforms.

whitequark avatar Jul 23 '24 18:07 whitequark

This is the GateMate generated Verilog file

/* Automatically generated by Amaranth 0.6.0.dev13. Do not edit. */
/* Generated by Yosys 0.40+22 (git sha1 fa0c5c1d4, clang++ 14.0.0-1ubuntu1.1 -fPIC -Os) */

module top(clk0_0__io, led_0__io);
  reg \$auto$verilog_backend.cc:2352:dump_module$1  = 0;
  wire [21:0] \$1 ;
  wire \$2 ;
  wire \$3 ;
  reg [20:0] \$4 ;
  reg \$5 ;
  wire clk;
  (* keep = "TRUE" *)
  wire clk0_0__i;
  input clk0_0__io;
  wire clk0_0__io;
  reg [20:0] count = 21'h000000;
  output led_0__io;
  wire led_0__io;
  reg led_0__o = 1'h0;
  (* init = 1'h0 *)
  wire leds;
  wire rst;
  always @(posedge clk)
    count <= \$4 ;
  always @(posedge clk)
    led_0__o <= \$5 ;
  assign \$1  = count + 1'h1;
  assign \$2  = count == 21'h1fffff;
  assign \$3  = ~ led_0__o;
  \top.cd_sync  cd_sync (
    .clk(clk),
    .rst(rst)
  );
  \top.pin_clk0_0  pin_clk0_0 (
    .clk0_0__i(clk),
    .clk0_0__io(clk0_0__io)
  );
  \top.pin_led_0  pin_led_0 (
    .led_0__io(led_0__io),
    .o(led_0__o)
  );
  always @* begin
    if (\$auto$verilog_backend.cc:2352:dump_module$1 ) begin end
    \$5  = led_0__o;
    if (\$2 ) begin
      \$5  = \$3 ;
    end
    if (rst) begin
      \$5  = 1'h0;
    end
  end
  always @* begin
    if (\$auto$verilog_backend.cc:2352:dump_module$1 ) begin end
    \$4  = \$1 [20:0];
    if (\$2 ) begin
      \$4  = 21'h000000;
    end
    if (rst) begin
      \$4  = 21'h000000;
    end
  end
  assign leds = led_0__o;
  assign clk0_0__i = clk;
endmodule

module \top.cd_sync (rst, clk);
  (* keep = "TRUE" *)
  input clk;
  wire clk;
  (* keep = "TRUE" *)
  wire clk0_0__i;
  output rst;
  wire rst;
  \top.cd_sync.reset_sync  reset_sync (
    .async_ff_clk(clk),
    .stage1(rst)
  );
  assign clk0_0__i = clk;
endmodule

module \top.cd_sync.reset_sync (stage1, async_ff_clk);
  input async_ff_clk;
  wire async_ff_clk;
  wire async_ff_rst;
  wire clk;
  (* init = 1'h1 *)
  wire rst;
  reg stage0 = 1'h1;
  output stage1;
  reg stage1 = 1'h1;
  always @(posedge async_ff_clk)
    stage0 <= 1'h0;
  always @(posedge async_ff_clk)
    stage1 <= stage0;
  assign async_ff_rst = 1'h0;
  assign clk = async_ff_clk;
  assign rst = stage1;
endmodule

module \top.pin_clk0_0 (clk0_0__io, clk0_0__i);
  (* keep = "TRUE" *)
  output clk0_0__i;
  wire clk0_0__i;
  input clk0_0__io;
  wire clk0_0__io;
  (* keep = "TRUE" *)
  wire i;
  \top.pin_clk0_0.buf  \buf  (
    .clk0_0__io(clk0_0__io),
    .i(clk0_0__i)
  );
  assign i = clk0_0__i;
endmodule

module \top.pin_clk0_0.buf (clk0_0__io, i);
  input clk0_0__io;
  wire clk0_0__io;
  output i;
  wire i;
  assign i = clk0_0__io;
endmodule

module \top.pin_led_0 (led_0__io, o);
  output led_0__io;
  wire led_0__io;
  wire led_0__o;
  input o;
  wire o;
  \top.pin_led_0.buf  \buf  (
    .led_0__io(led_0__io),
    .o(o)
  );
  assign led_0__o = o;
endmodule

module \top.pin_led_0.buf (led_0__io, o);
  output led_0__io;
  wire led_0__io;
  input o;
  wire o;
  assign led_0__io = o;
endmodule

While this is for ULX3S:

/* Generated by Yosys 0.42+10 (git sha1 ef9045882, g++ 11.4.0-1ubuntu1~22.04 -fPIC -Os) */

(* top =  1  *)
(* src = "/home/user/SDR-HLS/HLSImplementation/Examples/1.Blinky/Blinky.py:24" *)
(* generator = "Amaranth" *)
module top(rst, leds, clk);
  reg \$auto$verilog_backend.cc:2352:dump_module$1  = 0;
  wire [21:0] \$1 ;
  wire \$2 ;
  wire [7:0] \$3 ;
  reg [20:0] \$4 ;
  reg [7:0] \$5 ;
  (* src = "/home/user/FPGA/amaranth/amaranth/hdl/_ir.py:283" *)
  input clk;
  wire clk;
  (* src = "/home/user/SDR-HLS/HLSImplementation/Examples/1.Blinky/Blinky.py:27" *)
  reg [20:0] count = 21'h000000;
  (* src = "/home/user/SDR-HLS/HLSImplementation/Examples/1.Blinky/Blinky.py:20" *)
  output [7:0] leds;
  reg [7:0] leds = 8'h00;
  (* src = "/home/user/FPGA/amaranth/amaranth/hdl/_ir.py:283" *)
  input rst;
  wire rst;
  (* src = "/home/user/SDR-HLS/HLSImplementation/Examples/1.Blinky/Blinky.py:27" *)
  always @(posedge clk)
    count <= \$4 ;
  (* src = "/home/user/SDR-HLS/HLSImplementation/Examples/1.Blinky/Blinky.py:20" *)
  always @(posedge clk)
    leds <= \$5 ;
  assign \$1  = count + (* src = "/home/user/SDR-HLS/HLSImplementation/Examples/1.Blinky/Blinky.py:36" *) 1'h1;
  assign \$2  = count == (* src = "/home/user/SDR-HLS/HLSImplementation/Examples/1.Blinky/Blinky.py:37" *) 21'h1fffff;
  assign \$3  = ~ (* src = "/home/user/SDR-HLS/HLSImplementation/Examples/1.Blinky/Blinky.py:39" *) leds;
  always @* begin
    if (\$auto$verilog_backend.cc:2352:dump_module$1 ) begin end
    \$5  = leds;
    if (\$2 ) begin
      \$5  = \$3 ;
    end
    if (rst) begin
      \$5  = 8'h00;
    end
  end
  always @* begin
    if (\$auto$verilog_backend.cc:2352:dump_module$1 ) begin end
    \$4  = \$1 [20:0];
    if (\$2 ) begin
      \$4  = 21'h000000;
    end
    if (rst) begin
      \$4  = 21'h000000;
    end
  end
endmodule

The GateMate one is much longer and makes 7 modules, while the ULX3S is shorter and makes only 1. Could you explain further?

tarik-hamedovic avatar Jul 23 '24 23:07 tarik-hamedovic

These two files are not comparable. Your ULX3S Verilog file doesn't use an Amaranth platform, and if it did they'd look fairly similar.

Amaranth extensively uses modules for code reuse. Some of the modules above are trivial because the GateMate platform doesn't customize I/O primitives (as do most other FPGA platforms, even the relatively simple SiliconBlue one); Amaranth typically uses a module per I/O pin group, which in turn can use a module per pin, etc. The reset synchronizer has similar properties. These modules get folded during syntesis and there's no downside to having them besides perhaps aesthetics.

whitequark avatar Jul 24 '24 07:07 whitequark

In order for this PR to be complete and mergeable I expect the following items to be done:

  • [x] platform should output .il for toplevel
  • [x] platform should read both .v and .sv in the appropriate mode
  • ~~platform should produce SDC files~~
  • ~~SDC files should be tested in that the P&R tool should accept them~~
  • [x] P&R invocation should succeed and save the log in the appropriate place
  • ~~P&R tool syntax for an attribute that prevents wires from being optimized out should be investigated and the appropriate syntax should be used in add_clock_constraint, if available~~
  • [x] amaranth-boards should have a board file PR that can at least produce a working blinky for the platform, with the correct 1 Hz period
  • [ ] the entire flow should be tested on Windows and succeed
  • [ ] platform should have additional documentation:
    • [ ] all of the overrides that are used in the platform should be enumerated in the docstring, like other platforms do it
    • [ ] all of the reports that are produced by the build should be enumerated in the docstring, like other platforms do it
  • [x] the README should be updated to mention the new platform
  • [ ] the Sphinx documentation should be updated to include autogenerated documentation for the new platform
  • [ ] platform should use a reset synchronizer (see https://github.com/amaranth-lang/amaranth/pull/1460#issuecomment-2249013384)
  • [ ] platform should instantiate I/O buffers directly (see https://github.com/amaranth-lang/amaranth/pull/1460#issuecomment-2249013384)
  • [ ] platform should support DDR I/O buffers (see https://github.com/amaranth-lang/amaranth/pull/1460#issuecomment-2249013384)
  • [ ] latencies for DDR I/O buffers should be validated on hardware (by @whitequark or @wanda-phi)

Once all of the above is done the platform will be merged. I will also consider backporting it into the 0.5.x release series so that it can be used more early, on the stable branch.

We can then review the amaranth-boards file in more detail, as it's time-consuming but doesn't need to block the platform acceptance.

whitequark avatar Jul 24 '24 21:07 whitequark

  • amaranth-boards should have a board file PR that can at least produce a working blinky for the platform, with the correct 1 Hz period

I just realized that the development board you are using and the board that I have aren't the same board. Since I need to be able to test ongoing development I guess I'll have to find time to write a board file for the one CologneChip sent me, or you could also contribute a second one to make that faster.

whitequark avatar Jul 24 '24 21:07 whitequark

You have the GateMate Evaluation Board from CologneChip?

tarik-hamedovic avatar Jul 24 '24 21:07 tarik-hamedovic

I have "GateMate FPGA Starter Kit" which is a blue board with a different layout than what you posted.

whitequark avatar Jul 24 '24 21:07 whitequark

My friend has that board at home with him, so I can also ask him for help with this atleast via testing by uploading. I can help with writting the other amaranth_boards file aswell

tarik-hamedovic avatar Jul 24 '24 21:07 tarik-hamedovic

Sounds good to me. For now you testing on your own board is OK (managing review of two boards is a lot more difficult than one), I'll hack together a small board file with just a LED or something to test it myself.

whitequark avatar Jul 24 '24 21:07 whitequark

I have reviewed the CologneChip Primitive Library just now. In addition to the list above, the following primitives will need to be appropriately integrated into the platform:

  1. I/O buffers: Screenshot_20240724_223940
  2. Internal reset generator: Screenshot_20240724_224003

For the I/O buffers, the Lattice platform is quite similar and can be used as a template. For the reset generator I will probably implement it myself.

This work should be done after the rest is finished. It's fairly finicky so I may end up doing it myself, but I will explain what needs to be done either way.

whitequark avatar Jul 24 '24 22:07 whitequark

If you put the following contents into a file .../cc-toolchain-linux/settings.sh:

export PATH=.../cc-toolchain-linux/bin/openFPGALoader:.../cc-toolchain-linux/bin/p_r:.../cc-toolchain-linux/bin/yosys:${PATH}

and then put export AMARANTH_ENV_GATEMATE=.../cc-toolchain-linux/settings.sh into your ~/.bashrc, then you will be able to build Amaranth designs for GateMate platform without permanently having that particular toolchain in your PATH and without manually updating it.

whitequark avatar Jul 24 '24 22:07 whitequark

@TarikHamedovic After removing toolchain_prepare from your board definition file, I was able to produce a bitstream. Can you confirm?

whitequark avatar Jul 24 '24 23:07 whitequark

@TarikHamedovic After removing toolchain_prepare from your board definition file, I was able to produce a bitstream. Can you confirm?

I confirm that that was the problem. In the amaranth_boards disscussion you also said that the toolchain_prepare for this isn't really needed.

I tested it out on the board aswell and it blinks.

tarik-hamedovic avatar Jul 24 '24 23:07 tarik-hamedovic

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 91.11%. Comparing base (cde68fb) to head (f639850). Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1460      +/-   ##
==========================================
- Coverage   91.12%   91.11%   -0.02%     
==========================================
  Files          43       43              
  Lines       10990    11150     +160     
  Branches     2665     2715      +50     
==========================================
+ Hits        10015    10159     +144     
- Misses        805      821      +16     
  Partials      170      170              

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Jul 25 '24 00:07 codecov[bot]

A couple of questions to keep the pull request going. I was busy with other projects the last two weeks, but I'll try my best to finish this pull request completely, but I am going to need some guidance.

Looking at the tasks said above, the following questions are:

all of the overrides that are used in the platform should be enumerated in the docstring, like other platforms do it

How do I actually override commands? Where do I use it, if that makes sense. I have seen the documentation of other platforms and they have available overrides but I don't know how to invoke them? I assume that the overrides are mostly GateMate specific things that you can place along with synthesis, implementation and programming commands?

the Sphinx documentation should be updated to include autogenerated documentation for the new platform

When I do the reports and available overrides in the docstring, I should place them in the sphinx documentation akin to the quicklogic example?

platform should use a reset synchronizer (see https://github.com/amaranth-lang/amaranth/pull/1460#issuecomment-2249013384)

platform should instantiate I/O buffers directly (see https://github.com/amaranth-lang/amaranth/pull/1460#issuecomment-2249013384)

platform should support DDR I/O buffers (see https://github.com/amaranth-lang/amaranth/pull/1460#issuecomment-2249013384)

Any guidance on the tasks above?

I also added the GateMate Eval Board you are using to my fork of amaranth-boards where I only placed the clock, led and button. It should work, I'll test it out with my colleague tonight.

Also trying it on windows gives me the following message:

C:\Users\User\Desktop\FPGA\amaranth\examples>python Blinky.py -b -p amaranth_boards.gatemate_a1_evb.GateMate_A1_EVB -dp
Run 'yosys -h' for help.
Traceback (most recent call last):
  File "C:\Users\User\Desktop\FPGA\amaranth\examples\Blinky.py", line 125, in <module>
    plat.build(Blinky(num_leds, clock_divider), do_program=do_program)
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\amaranth\build\plat.py", line 103, in build
    products = plan.execute_local(build_dir)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\amaranth\build\run.py", line 115, in execute_local
    subprocess.check_call(["cmd", "/c", f"call {self.script}.bat"],
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\subprocess.py", line 413, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmd', '/c', 'call build_top.bat']' returned non-zero exit status 1.

tarik-hamedovic avatar Aug 08 '24 18:08 tarik-hamedovic

How do I actually override commands? Where do I use it, if that makes sense. I have seen the documentation of other platforms and they have available overrides but I don't know how to invoke them?

You can use AMARNATH_override=x environment variable, or plat.build(override="x") function call.

I assume that the overrides are mostly GateMate specific things that you can place along with synthesis, implementation and programming commands?

All of the overrides I expect this platform will need are already in the file. get_override() is what invokes them. They only need to be documented, which can be copied and pasted from another platform.

When I do the reports and available overrides in the docstring, I should place them in the sphinx documentation akin to the quicklogic example?

It is better to use the Lattice platform as an example as it's more complete (and I'm not really sure to be honest if anyone is even using QuickLogic). The format is the same, but Lattice has more of them, which might help.

platform should instantiate I/O buffers directly (see #1460 (comment))

platform should support DDR I/O buffers (see #1460 (comment))

The Lattice platform has examples for all of these tasks. At the time I don't have time available for a more detailed guidance.

Also trying it on windows gives me the following message:

I can't immediately see what's wrong with it. You should probably look into the Yosys log.

whitequark avatar Aug 08 '24 21:08 whitequark