clash-compiler
clash-compiler copied to clipboard
Add Xilinx dual clock FIFO primitive to `clash-cores`
This adds Xilinx's DcFifo to clash-cores. The Haskell model and the IP are both tested.
Past discussion is here; it is bit cluttered.
It's a draft PR because it depends on https://github.com/clash-lang/clash-compiler/pull/2257
Still TODO:
- [ ] ~Write a changelog entry (see changelog/README.md)~ (clash-cores isn't on Hackage)
- [x] Check copyright notices are up to date in edited files
- [ ] Fix
full
behavior and add a test for it - [ ] Fix
overflow
behavior and add a test for it - [x] Remove unused / obsolete code
- [x] Test reset+full behavior
@DigitalBrains1 and I just talked a little about the tests that are currently there. Peter noticed that we only test the happy path; that is, when both the write side and the read side behave properly with regards to the full/empty signals:
- Not writing when full is asserted
- Not reading when empty is asserted
We should add two basic tests:
- What happens when you write when full is asserted. Both the Haskell model and Xilinx model should drop the incoming write, but otherwise continue functioning correctly.
- What happens if you read when empty. This is mostly relevant for the Haskell model: it should continue to work when this happens. I.e., the simulation shouldn't crash completely.
Oh fair! I will get on that.
Ok just pushed those two. It now tests that ordering remains upon overflow, and it tests that the simulation completes when it underflows, that is, the DcFifo doesn't depend on undefined values to progress!
What I have now is inelegant - it has multiple modules since I can't simulate multiple top-level entities in vivado. This is because of https://github.com/clash-lang/clash-compiler/issues/2264 (which I would like to put off for the moment).
I still need to see what happens in RTL when the driver circuit doesn't respect the full
signal.
This is because of https://github.com/clash-lang/clash-compiler/issues/2264 (which I would like to put off for the moment).
Seems sensible
This depends on my hack here, but now has the thorough tests (RTL)
Ok, the RTL tests:
- Test of different (and same) clock domains, read>write and write>read.
- Test that nothing goes "backwards" even if we write when the FIFO is full. So we don't always get the expected data out, but we test that it's in the right order.
What I have now is inelegant - it has multiple modules since I can't simulate multiple top-level entities in vivado. This is because of https://github.com/clash-lang/clash-compiler/issues/2264
That's not the only reason. If you want to have multiple top-level entities simulated from the same clash (gen) then those will need separate temporary directories and obviously also separate test names. I was planning on adding that anyway since I want to use it for Xilinx Floating in clash-cores
.
Our SymbiYosys and Vivado CI environments both only support a single element in the list BuildSpecific [String]
, the other tools all correctly handle the case of a list with multiple elements. I wasn't planning on fixing SymbiYosys until we actually need it, there's more important things to do.
Note: I did not understand #2264 at a glance, so I might be missing something you think is obvious ;-)
@DigitalBrains1 sorry about the line-length :grimacing:, I've got a script to pinpoint things now
I see I missed quite a few things in my review, sorry. I'll add them as TODOs to the PR description. Thanks @DigitalBrains1 for being vigilant, I should have been more careful.
Thank you! I've cherry-picked that commit.
I suddenly notice we declare the VHDL component as follows:
component topEntity_dcfifo_C5BC86E64AAA3F54 port
( wr_clk : in std_logic
; wr_rst : in std_logic
; rd_clk : in std_logic
; rd_rst : in std_logic
; din : in std_logic_vector
; wr_en : in std_logic
; rd_en : in std_logic
; dout : out std_logic_vector
; full : out std_logic
; empty : out std_logic
; rd_data_count : out std_logic_vector
; wr_data_count : out std_logic_vector );
end component;
while the Vivado instantiation template says:
COMPONENT fifo_generator_0
PORT (
wr_clk : IN STD_LOGIC;
wr_rst : IN STD_LOGIC;
rd_clk : IN STD_LOGIC;
rd_rst : IN STD_LOGIC;
din : IN STD_LOGIC_VECTOR(17 DOWNTO 0);
wr_en : IN STD_LOGIC;
rd_en : IN STD_LOGIC;
dout : OUT STD_LOGIC_VECTOR(17 DOWNTO 0);
full : OUT STD_LOGIC;
empty : OUT STD_LOGIC;
rd_data_count : OUT STD_LOGIC_VECTOR(9 DOWNTO 0);
wr_data_count : OUT STD_LOGIC_VECTOR(9 DOWNTO 0)
);
END COMPONENT;
Note how they constrain the size of the vectors. According to Rinse, this is no problem, and I can actually synthesise the design, but I suspect it was not deliberate; we generally constrain our vector sizes, right? The problem is not with the invocation of compInBlock
as it is getting the sizes:
-- -1 for maybe
let dataTy = N.BitVector (DSL.tySize (DSL.ety wDataM) - 1)
let
compInps =
[ ("wr_clk", N.Bit)
, ("wr_rst", N.Bit)
, ("rd_clk", N.Bit)
, ("rd_rst", N.Bit)
, ("din", dataTy)
, ("wr_en", N.Bit)
, ("rd_en", N.Bit)
]
compOuts = catMaybes
[ Just ("dout", dataTy)
, Just ("full", N.Bit)
, Just ("empty", N.Bit)
, dcReadDataCount `orNothing` ("rd_data_count", N.BitVector depth)
, dcWriteDataCount `orNothing` ("wr_data_count", N.BitVector depth)
, dcUnderflow `orNothing` ("underflow", N.Bit)
, dcOverflow `orNothing` ("overflow", N.Bit)
]
-- [...]
DSL.compInBlock dcFifoName compInps compOuts
PS: this component is configured differently than currently in the PR; but the PR will catch up with it later.
Fix for that seems trivial:
--- a/clash-lib/src/Clash/Backend/VHDL.hs
+++ b/clash-lib/src/Clash/Backend/VHDL.hs
@@ -1504,7 +1504,7 @@ inst_ (CompDecl nm ps0) =
fmap Just $ "component" <+> pretty nm <+>
("port" <> line <> indent 2 (tupledSemi ps <> semi))
<> line <> "end component" <> semi
- where ps = traverse (\(t,pd,ty) -> pretty t <+> ":" <+> ppd pd <+> qualTyName ty) ps0
+ where ps = traverse (\(t,pd,ty) -> pretty t <+> ":" <+> ppd pd <+> sizedQualTyName ty) ps0
ppd = \case { In -> "in"; Out -> "out"}
inst_ (CondAssignment id_ _ scrut _ [(Just (BoolLit b), l),(_,r)]) = fmap Just $
@DigitalBrains1 yes! I came to the same conclusion, I have a PR just now.
I pushed three commits:
- I refactored the code such that we can again use
$(deriveTermToData ''DcConfig)
. Martijn and I thought this was the cleaner solution. It will really pay off if we later extend the configuration to more options which are encoded as an ADT to allow the user to only specify meaningful combinations of configurations. - I made the way we specify
IpConfig
properties
more pleasant to read and write. In the next commit, I add a lot ofproperties
. I definedTclShow
instances for all types I expect to occur inproperties
. If you realise I missed a likely candidate, please say so. - The big one: we now use synchronous resets. The documentation (product guide) on asynchronous resets is just too ambiguous and inconsistent. If you take a favourable reading, picking the most sensible option for every oddness, then you get a nicely working system with asynchronous resets. If you take a conservative approach and say "I'd better do it like this because that way it has to go right whatever they meant by that odd bit in the product guide", you get such a complex beast that it is easier to do the synchronisation yourself. So, let's do that. We have synchronous resets, and the section on synchronous resets in the product guide is pretty clear. A proper reset sequence that respects the conditions in the product guide can easily be achieved. The specific proper reset sequence to apply in practice is left to the user (who should read the product guide carefully anyway). But in a later commit, I'll add documentation to the stuff.
I discussed proper testing with Christiaan, and we've decided just not to do any reset at all in the test bench (note that resetting is optional for this FIFO configuration anyway; we need the functionality to flush the FIFO, but the FIFO doesn't care). I will create an issue after this PR is merged noting that we still need to test resets in the test bench. I think a well-designed test bench for resets is not trivial, and I want to get this stuff merged without the engineering effort involved. I trust we do it correctly anyway, I don't expect the test bench to hit any issues (*knocks on wood*).
The FIFO will only work with domains that have a synchronous reset, but the constraint for that at the type level is too painful to carry around. Rather, both simulation and generating HDL will call error
.
I removed HLINT ignore camelCase
because it isn't relevant anymore.
I reordered the imports not just to conform with our latest style guide, but also because I could not determine the principle behind the existing ordering and grouping, so I had no clue where to add imports ;-). Vanessa, if you want to create subgroups within the major groups defined by the style guide, you'll need to redo that.
Note that I reformatted this bit:
BlackBoxes.hs
:
-- see lines 51-56, 102-107, 202-205 where these args are used.
[2,7,8,9,10,11,12]
The problem with this kind of comment is that it is incredibly high maintenance and in fact it is likely to be overlooked when changing the file, making the line numbers no longer match up. Showcasing this: the line numbers were already out of date and incorrect!
Great, thanks! I don't have any objections, just questions, I'll have a review shortly.
With this comment:
I replied to a comment that asked whether we should make a CHANGELOG entry for this.
@martijnbastiaan yes! I added the changelog entry that mentioned records but I was confused by @DigitalBrains1's comment:
we can now process records and type-level naturals
I don't see how the new code allows us to process type-level naturals.
I don't see how the new code allows us to process type-level naturals.
It was loose phrasing to refer to the fact we can now do termToData
on data with an SNat
at least when you define it as
data DcConfig depth = DcConfig { dcDepth :: !depth, ... }
and then supply an SNat n
as depth
, thanks to the addition of TermLiteralSNat
.
I suppose your confusion is that you thought I referred to the problematic formulation of:
data DcConfig' depth = DcConfig' { dcDepth' :: !(SNat depth), ... }
which unfortunately doesn't work?
Oh yes I see, thank you! I have just pushed the full changelog.
@vmchale The current commit message isn't valuable for future code archeology. A good commit body should mention:
- What change was introduced
- Why the change was introduced
- Explain shortcomings / future work statements (although more often than not this is what inline
-- TODO
comments are for)
These aren't strict rules - sometimes reasons are completely obvious (like adding support for a Xilinx IP). Sometimes the change is fully explained in the code diff itself (do not assume code is self-explanatory though!). Sometimes there's no future work the author can think of.
Finally, in the footer it should refer to:
- Any issues (if applicable)
- Any co-authors (if applicable)
Ah I'm so sorry I didn't realize the Co-authored-by
lines had to be at the end of the commit 😬
The behavior of full/empty signals in the Haskell model is not tested.
What do you mean? It's in the test bench, right? I have to admit those are only run automatically in Vivado, which is a shortcoming we should one day fix, but I did run them manually in Haskell as well. Do you mean the test bench doesn't check cross-domain timing? That reminds me. The Haddock mentions
Note that the behavior of the FIFO in Haskell simulation does not correspond exactly to the behavior in HDL simulation, and that neither behaviors correspond exactly to hardware. All the different behaviors are functionally correct and equivalent when the usage guidelines in the product guide are observed.
and the original PR cover letter said
We don't care too much about having the Haskell model correspond 1:1 to the RTL for our client's project. Please de-prioritize.
I think a message to this effect should be in the commit message.