blarney
blarney copied to clipboard
Haskell library for hardware description
:toc: macro :toclevels: 4 :toc-title: :toc-placement!: :source-highlighter:
image::blarney-logo.svg#gh-light-mode-only[Blarney logo, width=275] image::blarney-logo-dark.svg#gh-dark-mode-only[Blarney logo, width=275]
Blarney is a Haskell library for hardware description that builds a range of HDL abstractions on top of a small set of pure functional circuit primitives. It is a modern variant of http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.110.5587&rep=rep1&type=pdf[Lava] with Verilog and SMT backends. Some aspects of the library are also inspired by https://github.com/B-Lang-org/bsc[Bluespec], such as first-class actions and method-based interfaces. Applications of Blarney include https://github.com/CTSRD-CHERI/SIMTight[SIMTight] (a CHERI-enabled RISC-V GPGPU) and https://github.com/blarney-lang/actora[Actora] (a stack machine with a compiler for an Erlang-like language). Below, we introduce the library by example, supplementing the http://blarney-lang.github.io/blarney/index.html[Haddock docs].
[discrete] == Contents
toc::[]
:sectnums:
== Prerequisites
We'll need Verilator and GHC 9.2.1 or later.
On Ubuntu 20.04, we can do:
[source, shell]
$ sudo apt install verilator
For GHC 9.2.1 or later, https://www.haskell.org/ghcup/[ghcup] can be used.
== Quick start
Recursively clone the repo and set the BLARNEY_ROOT environment
variable to point to it, and add the Scripts directory to your
PATH:
[source, shell]
$ git clone --recursive https://github.com/blarney-lang/blarney $ export BLARNEY_ROOT=$(pwd)/blarney $ export PATH=$PATH:$BLARNEY_ROOT/Scripts
To run an example using Blarney's in-Haskell simulator:
[source, shell]
$ cd Examples/Sorter $ make $ ./Sorter --simulate
You should see the output:
sort [3,4,1,0,2] = [0,1,2,3,4]
To generate Verilog for an example, and then simulate the Verilog
using Verilator, omit the --simulate option and then run the
generated makefile:
[source, shell]
$ ./Sorter # Run the Verilog generator $ cd Sorter-Verilog $ make # Compile the generated Verilog using Verilator $ ./Sorter # Simulate the generated Verilog
You should see the same output as before. Using Verilator is the recommended approach for simulating Blarney designs.
To run the regression test suite:
[source, shell]
$ cd Test $ ./test.sh --run-all
== Blarney by example
=== Two-sort
Sorting makes for a good introduction to the library. Let's start
with the simplest kind of sorter possible: given a pair of 8-bit
values, the function twoSort returns the sorted pair.
[source, haskell]
import Blarney
twoSort :: (Bit 8, Bit 8) -> (Bit 8, Bit 8) twoSort (a, b) = a .<. b ? ((a, b), (b, a))
This definition makes use of three Blarney constructs: the Bit type
for bit vectors (parametised by the size of the vector); the
comparison operator .<.; and the ternary conditional operator ?.
A quick test bench to check that it works:
[source, haskell]
top :: Module () top = always do display "twoSort (1,2) = " (twoSort (1,2)) display "twoSort (2,1) = " (twoSort (2,1)) finish
We use Blarney's always construct to perform the given action on
every clock cycle. Blarney actions include statements for displaying
values during simulation (display), terminating the simulator
(finish), and mutating state (see below). All statements in an
Action execute in parallel, within a single cycle of an implicit
clock. We can generate Verilog for the test bench as follows.
[source, haskell]
main :: IO () main = writeVerilogTop top "top" "/tmp/twoSort/"
Assuming the above code is in a file named Sorter.hs, it can be
compiled at the command-line using
[source, shell]
$ blc Sorter.hs
where blc stands for Blarney compiler. This is just a script
(from Blarney's Scripts directory) that invokes GHC with the
appropriate compiler flags. Running the resulting executable
./Sorter will produce Verilog in the /tmp/twoSort directory,
including a makefile to build a Verilator simulator. The simulator
can be built and run as follows.
[source, shell]
$ cd /tmp/twoSort $ make $ ./top twoSort (1,2) = (1,2) twoSort (2,1) = (1,2)
=== In-Haskell simulation
Sometimes it can be convenient to skip Verilog generation, and use the in-Haskell simulator.
[source, haskell]
main :: IO () main = simulate top
Now after running ./Sorter we see the test bench output directly.
[source, shell]
$ ./Sorter twoSort (1,2) = (1,2) twoSort (2,1) = (1,2)
The in-Haskell simulator is much slower than Verilator, but can be more convenient for small designs. It is currently an experimental feature.
=== Bubble sort
We can build a general N-element sorter by connecting together
multiple two-sorters. One of the simplest ways to do this is the
bubble sort network. The key component is a function bubble that
takes a list of inputs and returns a new list in which the smallest
element comes first.
[source, haskell]
bubble :: [Bit 8] -> [Bit 8] bubble [] = [] bubble [x] = [x] bubble (x:y:rest) = bubble (small:rest) ++ [large] where (small, large) = twoSort (x, y)
If we repeatedly call bubble then we end up with a sorted list.
[source, haskell]
sort :: [Bit 8] -> [Bit 8] sort [] = [] sort xs = smallest : sort rest where smallest:rest = bubble xs
Running the test bench
[source, haskell]
top :: Module () top = always do let inputs = [3, 4, 1, 0, 2] display "sort " inputs " = " (sort inputs) finish
in simulation yields:
sort [3,4,1,0,2] = [0,1,2,3,4]
To see that the sort function really is describing a circuit, let's
draw the circuit digram for a 5-element bubble sorter.
-->.
|
-->+---.
| |
Inputs -->+---+---. | | | -->+---+---+---. | | | | -->+---+---+---+---. | | | | | v v v v v
Outputs
The input list is supplied on the left, and the sorted output list is
produced at the bottom. Each + denotes a two-sorter that takes
inputs from the top and the left, and produces the smaller value to
the bottom and the larger value to the right. See
https://pdfs.semanticscholar.org/de30/22efc5aec833d7b52bd4770a382fea729bba.pdf[The
design and verification of a sorter core] for a more in-depth
exploration of sorting circuits in Haskell.
=== Polymorphism
For simplicity, we've made our sorter specific to lists of 8-bit values. But if we look at the types of the primitive functions it uses, we can see that it actually has a more general type.
[source, haskell]
(.<.) :: Cmp a => a -> a -> Bit 1 (?) :: Bits a => Bit 1 -> (a, a) -> a
So .<. can be used on any type in the
http://blarney-lang.github.io/blarney/Blarney-Core-Bit.html#t:Cmp[Cmp] (comparator)
class. Similarly, ? can be used on any type in the <<Bits>>
class (which allows packing to a bit vector and back again). So a more generic
definition of twoSort would be:
[source, haskell]
twoSort :: (Bits a, Cmp a) => (a, a) -> (a, a) twoSort (a, b) = a .<. b ? ((a, b), (b, a))
Indeed, this would be the type inferred by the Haskell compiler if no type signature was supplied. Using Haskell's rebindable syntax, we can also use an if-then-else expression instead of the ternary conditional operator:
[source, haskell]
twoSort :: (Bits a, Cmp a) => (a, a) -> (a, a) twoSort (a, b) = if a .<. b then (a, b) else (b, a)
=== Mutable registers
So far, we've only seen display and finish actions inside a Blarney module.
Also supported are creation and assignment of registers. To illustrate, here
is a module that creates a 4-bit cycleCount register, increments it on each
cycle, stopping when it reaches 10.
[source, haskell]
top :: Module () top = do -- Create a register cycleCount :: Reg (Bit 4) <- makeReg 0
always do -- Increment on every cycle cycleCount <== cycleCount.val + 1
-- Display value on every cycle
display "cycleCount = " cycleCount.val
-- Terminate simulation when count reaches 10
when (cycleCount.val .==. 10) do
display "Finished"
finish
This example introduces a number of new library functions: makeReg
creates a register, initialised to the given value; the val field
yeilds the current value of the register; and when allows
conditional actions to be introduced. We can use if-then-else in an
Action context. For example, the final three lines above could have
been written as:
[source, haskell]
-- Terminate simulation when count reaches 10 if cycleCount.val .==. 10 then do display "Finished" finish else display "Not finished"
Running top in simulation gives
cycleCount = 0 cycleCount = 1 cycleCount = 2 cycleCount = 3 cycleCount = 4 cycleCount = 5 cycleCount = 6 cycleCount = 7 cycleCount = 8 cycleCount = 9 cycleCount = 10 Finished
=== Queues
Queues (also known as FIFOs) are a commonly used abstraction in hardware
design. Blarney provides http://blarney-lang.github.io/blarney/Blarney-Queue.html[a
range of different queue implementations], all of which implement the following
interface available when importing Blarney.Queue.
[source, haskell]
-- Queue interface data Queue a = Queue { notEmpty :: Bit 1 -- Is the queue non-empty? , notFull :: Bit 1 -- Is there any space in the queue? , enq :: a -> Action () -- Insert an element (assuming notFull) , deq :: Action () -- Remove the first element (assuming canDeq) , canDeq :: Bit 1 -- Guard on the deq and first methods , first :: a -- View the first element (assuming canDeq) }
The type Queue a represents a queue holding elements of type a, and
provides a range of standard functions on queues. The enq method should only
be called when notFull is true and the deq method should only be called
when canDeq is true. Similarly, the first element of the queue is only
valid when canDeq is true. Below, we present the simplest possible
implementation of a one-element queue.
[source, haskell]
import Blarney.Queue
-- Simple one-element queue implementation makeSimpleQueue :: Bits a => Module (Queue a) makeSimpleQueue = do -- Register holding the one element reg :: Reg a <- makeReg dontCare
-- Register defining whether or not queue is full full :: Reg (Bit 1) <- makeReg 0
-- Methods return Queue { notFull = full.val .==. 0 , notEmpty = full.val .==. 1 , enq = \a -> do reg <== a full <== 1 , deq = full <== 0 , canDeq = full.val .==. 1 , first = reg.val }
The following simple test bench illustrates how to use a queue.
[source, haskell]
-- Small test bench for queues top :: Module () top = do -- Instantiate a queue of 8-bit values queue :: Queue (Bit 8) <- makeSimpleQueue
-- Create an 8-bit count register count :: Reg (Bit 8) <- makeReg 0
always do count <== count.val + 1
-- Writer side
when queue.notFull do
queue.enq count.val
display "Enqueued " count.val
-- Reader side
when queue.canDeq do
queue.deq
display "Dequeued " queue.first
-- Terminate after 100 cycles
when (count.val .==. 100) finish
=== Mutable wires
Wires are a feature of the Action monad that offer a way for separate
action blocks to communicate within the same clock cycle. Whereas assignment
to a register becomes visible on the clock cycle after the assigment occurs,
assignment to a wire is visible on the same cycle as the assignment. If no
assignment is made to a wire on a particular cycle, then the wire emits its
default value on that cycle. When multiple assignments to the same wire
occur on the same cycle, the wire emits the bitwise disjunction of all the
assigned values.
To illustrate, let's implement an n-bit counter module that supports increment and decrement operations.
[source, haskell]
-- Interface for a n-bit counter data Counter n = Counter { inc :: Action () , dec :: Action () , output :: Bit n }
We'd like the counter to support parallel calls to inc and dec. That is,
if inc and dec are called on the same cycle then the counter's output is
unchanged. We'll achieve this using wires.
[source, haskell]
makeCounter :: KnownNat n => Module (Counter n) makeCounter = do -- State count :: Reg (Bit n) <- makeReg 0
-- Wires incWire :: Wire (Bit 1) <- makeWire 0 decWire :: Wire (Bit 1) <- makeWire 0
always do -- Increment when (incWire.val .&&. inv decWire.val) do count <== count.val + 1
-- Decrement
when (inv incWire.val .&&. decWire.val) do
count <== count.val - 1
-- Interface return Counter { inc = do incWire <== 1 dec = do decWire <== 1 output = count.val }
=== Recipes
State machines are a common way of defining the control-path of a circuit. They are typically expressed by doing case-analysis of the current state and manually setting the next state. Quite often however, they can be expressed more neatly in a http://blarney-lang.github.io/blarney/Blarney-Recipe.html[Recipe] -- a simple imperative language with various control-flow constructs.
[source, haskell]
data Recipe = Skip -- Do nothing (in zero cycles) | Tick -- Do nothing (in one cycle) | Action (Action ()) -- Perform action (in one cycle) | Seq [Recipe] -- Execute recipes in sequence | Par [Recipe] -- Fork-join parallelism | Wait (Bit 1) -- Block until condition holds | When (Bit 1) Recipe -- Conditional recipe | If (Bit 1) Recipe Recipe -- If-then-else recipe | While (Bit 1) Recipe -- Loop | Background Recipe -- Run recipe in background
To illustrate, here is a small state machine that computes the factorial of 10.
[source, haskell]
fact :: Module () fact = do -- State n :: Reg (Bit 32) <- makeReg 0 acc :: Reg (Bit 32) <- makeReg 1
-- Compute factorial of 10 let recipe = Seq [ Action do n <== 10 , While (n.val .>. 0) ( Action do n <== n.val - 1 acc <== acc.val * n.val ) , Action do display "fact(10) = " acc.val finish ]
runRecipe recipe
Blarney provides a lightweight compiler for the Recipe language (under 100
lines of code), which we invoke above through the call to runRecipe.
A very common use of recipes is to define test sequences. For example, here is
a simple test sequence for the Counter module defined earlier.
[source, haskell]
-- Test-bench for a counter top :: Module () top = do -- Instantiate an 4-bit counter counter :: Counter 4 <- makeCounter
-- Sample test sequence let test = Seq [ Action do counter.inc , Action do counter.inc , Action do counter.inc counter.dec , Action do display "counter = " counter.output finish ]
runRecipe test
Here, we increment counter on the first cycle, and then again on the second.
On the third cycle, we both increment and decrement it in parallel. On the
fourth cycle, we display the value and terminate the simulator.
=== Statements
For convenience, recipes can also be constucted using do notation. The
http://blarney-lang.github.io/blarney/Blarney-Stmt.html[Stmt] monad is simply a
wrapper around Recipe, which defines monadic bind as sequential composition.
It is entirely syntatic sugar, providing no new functionality.
To illustrate, here's the factorial example from earlier, rewritten using the
Stmt monad.
[source, haskell]
fact :: Module () fact = do -- State n :: Reg (Bit 32) <- makeReg 0 acc :: Reg (Bit 32) <- makeReg 1
-- Compute factorial of 10 let stmt = do action do n <== 10 while (n.val .>. 0) do action do n <== n.val - 1 acc <== acc.val * n.val action do display "fact(10) = " acc.val finish
runStmt stmt
We have found that some users prefer Recipe syntax, while others prefer
Stmt syntax, so we offer both.
=== Block RAMs
Blarney provides http://blarney-lang.github.io/blarney/Blarney-Core-RAM.html[a variety of block RAM modules] commonly supported on FPGAs. They are all based around the following interface.
[source, haskell]
-- Block RAM interface -- (Parameterised by the address width a and the data width d) data RAM a d = RAM { load :: a -> Action () , store :: a -> d -> Action () , out :: d }
When a load is issued for a given address, the value at that address appears
on out on the next clock cycle. When a store is issued, the value is
written to the RAM on the current cycle, and a load of the new value can be
requested on the subsequent cycle. A parallel load and store should only
be issued on the same cycle if the RAM has been created as a dual-port RAM (as
opposed to a single-port RAM). To illustrate, here is a test bench that
creates a single-port block RAM and performs a store followed by a load.
[source, haskell]
top :: Module () top = do -- Instantiate a 256 element RAM of 5-bit values ram :: RAM (Bit 8) (Bit 5) <- makeRAM
-- Write 10 to ram[0] and read it back again runStmt do action do store ram 0 10 action do load ram 0 action do display "Got " ram.out finish
Somewhat-related to block RAMs are http://blarney-lang.github.io/blarney/Blarney-Core-Module.html#t:RegFile[register files]. The difference is that a register file allows the value at an address to be determined within a clock cycle. It also allows any number of reads and writes to be performed within the same cycle. Register files have the following interface.
[source, haskell]
data RegFile a d = RegFile { index :: a -> d -- Read , update :: a -> d -> Action() -- Write }
To read from a register file, use the index method or the generic lookup
operator !. Unlike block RAMs, register files (especially large ones) do not
always map efficiently onto hardware, so use with care!
=== Sources, sinks, and streams
[#sources-sinks-streams]
Sources and sinks are commonly-used flow-control abstractions in hardware description. They are often used to implement hardware modules that produce or consume data at a variable rate, depending on internal details of the module that the implementer does not wish to (or is unable to) expose. In Blarney, http://blarney-lang.github.io/blarney/Blarney-SourceSink.html[sources and sinks] are captured by the following interfaces.
[source, haskell]
-- Data is consumed from a source data Source t = Source { -- The next value being produced by the source peek :: t -- Invoke this action to consume the next value , consume :: Action () -- Can the source currently be peeked or consumed? , canPeek :: Bit 1 }
-- Data is injected into a sink data Sink t = Sink { -- Can a value be injected into the sink? canPut :: Bit 1 -- Inject the given value into the sink , put :: t -> Action () }
-- A stream is another name for a source (discussed below) type Stream t = Source t
A queue is both a source and a sink.
[source, haskell]
-- Convert a queue to a source instance ToSource (Queue t) t where toSource :: Queue t -> Source t toSource q = Source { canPeek = q.canDeq , peek = q.first , consume = q.deq }
-- Convert a queue to a sink instance ToSink (Queue t) t where toSink :: Queue t -> Sink t toSink q = Sink { canPut = q.notFull , put = q.enq }
-- Another name for toSource (discussed below) toStream :: ToSource a b => a -> Stream b toStream = toSource
Sources and sinks can be https://blarney-lang.github.io/blarney/Blarney-Connectable.html[connected together].
Note that taking a sink as a function argument (input) is very similar
to returning a source as a function result (output). Both allow the
function to produce data at a variable rate. Is it therefore
redundant to provide both Source and Sink? Not quite. When a
function takes a sink as input, it knows when the caller is ready to
consume before producing data; when a function returns a source as
output, it knows when the caller does consume after producing data.
This subtle difference can be important when the programmer wants
minimise buffering and latency between producer and consumer. Often
though, we don't mind buffering (it's good for Fmax) so our convention
is to use the Stream type in most circumstances.
As an example, here's a function that increments each value in an input stream to produce an output stream.
[source, haskell]
inc :: Stream (Bit 8) -> Module (Stream (Bit 8)) inc xs = do -- Output buffer buffer <- makeQueue
always do -- Incrementer when (xs.canPeek .&&. buffer.notFull) do xs.consume buffer.enq (xs.peek + 1)
-- Convert buffer to a stream return (toStream buffer)
=== Modular compilation
So far we've seen examples of top-level modules, i.e. modules with no inputs or
outputs, being converted to Verilog. In fact, any Blarney function whose
inputs and outputs are members of the
http://blarney-lang.github.io/blarney/Blarney-Core-Interface.html[Interface] class can
be converted to Verilog (and the Interface class supports generic deriving).
To illustrate, we can convert the function inc (defined
<<sources-sinks-streams, above>>) into
a Verilog module as follows.
[source, haskell]
main :: IO () main = writeVerilogModule inc "inc" "/tmp/inc"
The generated Verilog module /tmp/inc/inc.v has the following
interface:
[source, systemverilog]
module inc( input wire clock , input wire reset , output wire [0:0] in0_consume_en , input wire [0:0] in0_canPeek , input wire [7:0] in0_peek , input wire [0:0] out_consume_en , output wire [7:0] out_peek , output wire [0:0] out_canPeek );
Considering the definition of the Stream type, the correspondance between the
Blarney and the Verilog is quite clear:
[cols="1,3", options="header"] |=== |Signal |Description
|in0_consume_en
|Output asserted whenever the module consumes an element from the input stream.
|in0_canPeek
|Input signalling when there is data available in the input stream.
|in0_peek
|Input containing the next value in the input stream.
|out_canPeek
|Output asserted whenever there is data available in the output stream.
|out_peek
|Output containing the next value in the output stream.
|out_consume_en
|Input signalling when the caller consumes an element from the output stream.
|===
It is also possible to instantiate a Verilog module inside a Blarney
description. To illustrate, here is a function that creates an instance of the
Verilog inc module shown above.
[source, haskell]
-- This function creates an instance of a Verilog module called "inc" makeInc :: Stream (Bit 8) -> Module (Stream (Bit 8)) makeInc = makeInstance "inc"
Notice that interface of the Verilog module being instantiated is determined
from the type signature. Here's a sample top-level module that uses the
makeInc function:
[source, haskell]
top :: Module () top = do -- Counter count :: Reg (Bit 8) <- makeReg 0
-- Input buffer buffer <- makeQueue
-- Create an instance of inc out <- makeInc (toStream buffer)
always do -- Fill input when buffer.notFull do buffer.enq count.val count <== count.val + 1
-- Consume
when out.canPeek do
out.consume
display "Got " out.peek
when (out.peek .==. 100) finish
Using the following main function we can generate both the inc module and a
top-level module that instantiates it.
[source, haskell]
main :: IO () main = do let dir = "/tmp/inc" writeVerilogModule inc "inc" dir writeVerilogTop top "top" dir
Using this approach, we can maintain the module hierarchy of a Blarney design whenever we generate Verilog, rather than having to flatten it to big monolithic netlist. This technique can also be used to instantiate any Verilog module within a Blarney design.
When simply marking netlist boundaries within a Blarney design, the
makeInstance/writeVerilogModule combination is rather low-level
and error-prone. In particular, there is no requirement for the type
of the instance to match the type of the module, and it would be nice
to specify a boundary in a backend-independent way. To solve these
problems, Blarney provides a makeBoundary function. We can now
define makeInc as:
[source, haskell]
makeInc :: Stream (Bit 8) -> Module (Stream (Bit 8)) makeInc = makeBoundary "inc" inc
Unlike makeInstance, makeBoundary takes the module to instantiate
as an argument. The type of the argument to makeBoundary must match
the return type:
[source, haskell]
makeBoundary :: Modular m => String -> m -> m
This means that it is unncessary to supply a type signature for
makeInc now; it will be inferred. Furthermore, the top-level of our
design no longer needs to call writeVerilogModule for the inc
module because Blarney now knows how to generate a module for any
instance that it encounters.
=== Master-slave pattern
This is a common pattern in hardware design. Suppose we wish to move a multiplier out of a module and into an separate slave module, where the slave takes requests (pairs of 32-bit integers to multiply) and produces responses (32-bit results).
[source, haskell]
type MulReq = (Bit 32, Bit 32) type MulResp = Bit 32
The slave component might be defined as:
[source, haskell]
slave :: Stream MulReq -> Module (Stream MulResp) slave reqs = do resps <- makeQueue
always do when (reqs.canPeek .&&. resps.notFull) do reqs.consume let (a, b) = reqs.peek resps.enq (a * b)
return (toStream resps)
The master component produces requests for the slave, and consumes responses from the slave. In the example below, the master simply asks the slave to multiply 2 by 2, waits for the response, and then terminates the simulation.
[source, haskell]
master :: Stream MulResp -> Module (Stream MulReq) master resps = do reqs <- makeQueue
runStmt do wait reqs.notFull action do reqs.enq (2, 2) wait resps.canPeek action do resps.consume display "Result: " resps.peek finish
return (toStream reqs)
The top-level module which connects the master and the slave needs to introduce
a cycle, which can be achieved simply using Haskell's recursive-do (mdo)
notation:
[source, haskell]
top :: Module () top = mdo resps <- slave reqs reqs <- master resps return ()
=== Tagged unions
[#tagged-unions]
Sum types such as
[source, haskell]
data Either a b = Left a | Right b
do not permit generic deriving for the Bits class, so cannot be used
for circuit-time values. (An elaboration-time value cannot be
influenced by a circuit-time value, making the definition of unpack
problematic for sum types, at least without resorting to language
plugins). However, Blarney does support tagged unions, allowing the
following definition.
[source, haskell]
import Blarney.TaggedUnion
type Either a b = TaggedUnion [ "left" ::: a , "right" ::: b ]
The API for tagged unions is illustrated by the sample functions below.
[source, haskell]
makeLeft :: Bits a => a -> Either a b makeLeft x = tag #left x
isLeft :: Either a b -> Bit 1
isLeft x = x is #left
isRight :: Either a b -> Bit 1
isRight x = x is #right
getLeft :: Bits a => Either a b -> a getLeft x = untag #left x
getLeftOrZero :: Bits a => Either a b -> a getLeftOrZero x = untagDefault #left zero x
exampleAction :: Action () exampleAction = do let foo :: Either (Bit 2) (Bit 4) = tag #right 15 whenTagged #right foo \r -> do display "Right val: " r
=== Bit selection and lookup
Bit selection operators are used to extract a subset of bits out of a bit-vector. There are different flavours, depending on whether the indices are type-level numbers, elaboration-time numbers, or circuit-level numbers.
For type-level indices, we provide functions http://blarney-lang.github.io/blarney/Blarney-Core-Bit.html#v:at[at] and http://blarney-lang.github.io/blarney/Blarney-Core-Bit.html#v:slice[slice], and use type application to specify the type-level indices:
[source, haskell]
-- Extract most-sigificant bit of a byte msb :: Bit 8 -> Bit 1 msb x = at @7 x
-- Extract upper 4 bits of a byte upperNibble :: Bit 8 -> Bit 4 upperNibble x = slice @7 @4 x
For elaboration-time indices of type Int, we provide
http://blarney-lang.github.io/blarney/Blarney-Core-Bit.html#v:unsafeAt[unsafeAt] and
http://blarney-lang.github.io/blarney/Blarney-Core-Bit.html#v:unsafeSlice[unsafeSlice]:
[source, haskell]
-- Extract most-sigificant bit of a byte msb :: Bit 8 -> Bit 1 msb x = unsafeAt 7 x
-- Extract upper 4 bits of a byte upperNibble :: Bit 8 -> Bit 4 upperNibble x = unsafeSlice (7, 4) x
The argument to unsafeAt could be out of range, and the result of
unsafeSlice could have a different width to that implied by the range. Such
cases will lead to confusing error messages, hence the "unsafe" prefix on the
function names.
Finally, for circuit-level indicies of type Bit n, the generic lookup
operator ! can be used:
[source, haskell]
-- Extract bit from byte at given index getBit :: Bit 8 -> Bit 3 -> Bit 1 getBit x i = x!i
Blarney's generic lookup operator x!i returns the element of x at
index i, and works for many different types of x and i. See
<<Lookup>> for more details.
=== Bit-string pattern matching
Recent work on specifying and implementing ISAs led us to develop two libraries
for doing bit-string pattern matching. The first,
http://blarney-lang.github.io/blarney/Blarney-BitPat.html[BitPat], is statically-typed
and based on the paper https://core.ac.uk/download/pdf/50525461.pdf[Type-safe
pattern combinators]. The second,
http://blarney-lang.github.io/blarney/Blarney-BitScan.html[BitScan], is dynamically
typed but more expressive. As an example, BitScan, let's us define the
following instruction decoder for a tiny subset of RISC-V.
[source, haskell]
import Blarney.BitScan
-- Semantics of add instruction add :: Bit 5 -> Bit 5 -> Bit 5 -> Action () add rs2 rs1 rd = display "add r" rd ", r" rs1 ", r" rs2
-- Semantics of addi instruction addi :: Bit 12 -> Bit 5 -> Bit 5 -> Action () addi imm rs1 rd = display "addi r" rd ", r" rs1 ", " imm
-- Semantics of store instruciton sw :: Bit 12 -> Bit 5 -> Bit 5 -> Action () sw imm rs2 rs1 = display "sw r" rs2 ", " imm "(r" rs1 ")"
top :: Module () top = always do -- Sample RISC-V store-word instruction let instr :: Bit 32 = 0b1000000_00001_00010_010_00001_0100011
-- Dispatch match instr [ "0000000 rs2[4:0] rs1[4:0] 000 rd[4:0] 0110011" ==> add, " imm[11:0] rs1[4:0] 000 rd[4:0] 0010011" ==> addi, "imm[11:5] rs2[4:0] rs1[4:0] 010 imm[4:0] 0100011" ==> sw ]
finish
The nice thing about this decoder is that the scattered immediate field imm
in the sw instruction is automatically assembled by the library. That is,
the imm[11:5] part of the immediate is combined with the imm[4:0] part to
give the final 12-bit immediate value passed to the right-hand-side function.
Scattered immediates appear a lot in the RISC-V specification. Thanks to Jon
Woodruff for suggesting this feature!
For a fuller example of the BitScan module, see the Pebbles RV32I
https://github.com/blarney-lang/pebbles/blob/master/src/Pebbles/Instructions/RV32_I.hs[instruction decoder].
=== CPUs
A few processor cores have been implemented in Blarney:
- https://github.com/blarney-lang/blarney/blob/master/Examples/CPU/CPU.hs[Simple]: 4-stage 8-bit CPU, with just 4 instructions, for learning.
- https://github.com/blarney-lang/pebbles/[Pebbles]: RISC-V CPU+GPU using plugable pipelines.
- https://github.com/blarney-lang/actora/[Actora]: 3-stage stack machine that runs code written a subset of Erlang.
=== Namer plugin
One of the classic limitations of Lava is that identifier names are lost when the netlist is generated. In particular, this is problematic when we want to analyse, say, the critical-path of our circuit using a third-party tool, but there is no way to map the netlist names reported by the tool back to the Lava names in the original description.
Blarney provides a solution to this problem in the form of the https://github.com/blarney-lang/blarney/blob/master/Haskell/BlarneyPlugins/Namer[Namer plugin]. This is a simple GHC plugin (around 150 lines of code) that looks for monadic bindings of the form
[source, haskell]
x <- m
where m has type Module a for any a, and automatically rewrites the
binding as
[source, haskell]
x <- withName "x" m
where
http://blarney-lang.github.io/blarney/Blarney-Core-Module.html#v:withName[withName] is
a Blarney primitive that introduces name information inside m This simple
approach captures quite a lot of useful names.
The plugin is completely optional, and disabled by default. To enable it, first install using cabal
[source, shell]
cd Haskell/BlarneyPlugins/Namer cabal v1-install
and then pass the --enable-namer-plugin flag to blc.
To further improve the readability of generated code, we can also pass the
--enable-name-prop and --enable-simplifier options to our circuit
generator. This will enable the (experimental) name propagation and netlist
simplification passes respectively.
== Type class overview
=== Bits
Any type in the http://blarney-lang.github.io/blarney/Blarney-Core-Bits.html[Bits] class can be represented in hardware, e.g. stored in a wire, a register, or a RAM.
[source, haskell]
class Bits a where type SizeOf a :: Nat sizeOf :: a -> Int pack :: a -> Bit (SizeOf a) unpack :: Bit (SizeOf a) -> a
The Bits class supports generic deriving. For example, suppose we have a
simple data type for memory requests:
[source, haskell]
data MemReq = MemReq { memOp :: Bit 1 -- Is it a load or a store request? , memAddr :: Bit 32 -- 32-bit address , memData :: Bit 32 -- 32-bit data for stores } deriving (Generic, Bits)
To make this type a member of the Bits class, we have suffixed it with
derving (Generic, Bits). The generic deriving mechanism for Bits does not
support sum types: there is no way to convert a bit-vector (run-time circuit
value) to a sum type (elaboration-time value) using the circuit primitives
provided by Blarney (however, see <<tagged-unions, tagged unions>>).
=== Interface
Any type in the
http://blarney-lang.github.io/blarney/Blarney-Core-Interface.html[Interface] class can
be used as a module input or output when doing <<modular-compilation, modular
compilation>>. Furthermore, collections of interfaces can be indexed by
circuit-time values using the ! operator. To illustrate, here is an example
circuit to split a stream of <<bits, MemReq>> into four streams, using the
lower two bits of the address to decide which output stream to use.
[source, haskell]
split :: Stream MemReq -> Module [Stream MemReq] split reqs = do -- Create a list of 4 queues queues :: [Queue MemReq] <- replicateM 4 makeQueue
always do -- Consume request, and put into appropriate queue when reqs.canPeek do let i :: Bit 2 = truncate reqs.peek.memAddr when (queues!i).notFull do reqs.consume (queues!i).enq reqs.peek
return (map toStream queues)
The Interface class supports generic deriving: just add Interface to the
deriving clause for the datatype. In the above example, MemReq is an
Interface, and so too is Queue a for any a that is also an Interface.
=== Lookup
The generic lookup operator ! is provided by the
http://blarney-lang.github.io/blarney/Blarney-Core-Lookup.html[Lookup] class.
[source, haskell]
-- Index a collection 'c' of elements 'e' using index 'i' class Lookup c i e | c -> e where (!) :: c -> i -> e
A wide range of combinations of types are supported. The functional dependency
c -> e allows the return type to be inferred from the collection type.
=== FShow
Any value whose type is in the
http://blarney-lang.github.io/blarney/Blarney-Core-FShow.html[FShow] class, or any
value of type Format, can be passed as arguments to the variadic display
function.
[source, haskell]
class FShow a where fshow :: a -> Format fshowList :: [a] -> Format -- Has default definition
-- Abstract data type for things that can be displayed newtype Format
-- Format constructors mempty :: Format -- Empty (from Monoid class) (<>) :: Format -> Format -> Format -- Append (from Monoid class)
As an example, here is how the FShow instance for pairs is defined.
[source, haskell]
-- Example instance: displaying pairs instance (FShow a, FShow b) => FShow (a, b) where fshow (a, b) = fshow "(" <> fshow a <> fshow "," <> fshow b <> fshow ")"
The FShow class supports generic deriving.
The radix and padding used to display a bit vector can be specified using the following functions.
[source, haskell]
-- Display bit vector in binary with given amount of zero padding formatBin :: Int -> Bit n -> Format
-- Display bit vector in decimal with given amount of zero padding formatDec :: Int -> Bit n -> Format
-- Display bit vector in hex with given amount of zero padding formatHex :: Int -> Bit n -> Format
The FShow instance for Bit n uses decimal format with no padding.
=== Cmp
The Cmp (comparator) class provides a range of familiar comparison
operators, and supports generic deriving.
[source, Haskell]
class Cmp a where (.<.) :: a -> a -> Bit 1 (.<=.) :: a -> a -> Bit 1 (.==.) :: a -> a -> Bit 1 (.>.) :: a -> a -> Bit 1 (.>=.) :: a -> a -> Bit 1 (.!=.) :: a -> a -> Bit 1
Only the first three operators must be defined; the others have default definitions.
=== Assign
The assignment operator is overloaded.
[source, Haskell]
class Assign v where (<==) :: Bits a => v a -> a -> Action ()
Example instances are http://blarney-lang.github.io/blarney/Blarney-Core-Module.html#t:Reg[Reg], http://blarney-lang.github.io/blarney/Blarney-Core-Module.html#t:Wire[Wire], and http://blarney-lang.github.io/blarney/Blarney-Core-Module.html#t:WriteOnly[WriteOnly].
=== ToSource and ToSink
Converting interfaces to http://blarney-lang.github.io/blarney/Blarney-SourceSink.html[sources and sinks] may turn out to be common. For example, http://blarney-lang.github.io/blarney/Blarney-Queue.html[Queue] and http://blarney-lang.github.io/blarney/Blarney-Stack.html[Stack] are both sources and sinks. Therefore the following type classes are provided.
[source, Haskell]
-- Convert to a source class ToSource a b | a -> b where toSource :: a -> Source b
-- Convert to a sink class ToSink a b | a -> b where toSink :: a -> Sink b