circt [ExportVerliog] Add a `sv.hint.emit_as

In some cases we would like to emit array_get operations in SV with additional vendor pragmas, like cadence map_to_mux and synopsys infer_mux_override. This comes up in the FIRRTL dialect when lowering the MultibitMuxOp.

The current solution requires LowerToHW to not only generate the array_get operation, but also introduce an sv.wire and sv.assign with the corresponding SV attributes configured. What bothers me about this is that it skips the HW dialect level and makes the FIRRTL dialect talk about SV things, which precludes us from reasoning about the design at an HW level without any knowledge of SV. Furthermore, it seems useful for other dialects to be able to say "decorate this array_get such that it gets inferred as a mux by downstream tools", instead of resorting to mixing HW with SV themselves.

This PR changes FIRRTL's LowerToHW to just lower MultibitMuxOp to the array_get op, but with an additional sv.hint.emit_as_mux unit attribute. This attribute is then picked up by ExportVerilog's emission preparation, which inserts the appropriate SV IR nodes to get the temporary wire with the desired pragma decorations.

As a result, dialects above HW don't need to talk about the SV AST through the SV dialect just to achieve an optimization hint in the SV output, and the SV dialect becomes responsible for cooking up the necessary pragmas and decorations to honor the hint.

Sep 09 '22 01:09 fabianschuiki

This feels like a hack.

We've always discussed expanding the MuxOp to have >2 inputs, but always said "if we find a reason, we'll do it". Ensuring that downstream tools infer the right thing seems to me to be a compelling reason. I think that would be the better solution.

Sep 09 '22 03:09 teqdruid

I agree we don't want to introduce pragmas specific to SV AST at LowerToHW so delaying the lowering makes sense to me.

We've always discussed expanding the MuxOp to have >2 inputs, but always said "if we find a reason, we'll do it". Ensuring that downstream tools infer the right thing seems to me to be a compelling reason.

I think extending mux op to have >2 inputs would solve the issue but we will eventually want to annotate same pragmas to array indexing for arrays on ports. I think it would be difficult to apply >2 inputs mux op for array ports because we have to unwrap array elements.

Also regarding multibit mux, I've wondered if we could have good mechanism to control multibit mux emission styles more flexibly. Actually we have wanted to try different flavors of mux like 2:1 mux and 4:1 mux as well. So maybe we can introduce sv.hint.array_indexing_style={plain, pragmas, 2to1, 4to1, ...} enum attribute and lower the array get after SV/HW/Comb canonicalizations.

One concern I have is ExportVerilog/Prepare might not be the right place to perform the lowering. Probably we want to have LowerSVHint pass and run it before PrettifyVerilog.

Sep 09 '22 13:09 uenoku

I think extending mux op to have >2 inputs would solve the issue but we will eventually want to annotate same pragmas to array indexing for arrays on ports. I think it would be difficult to apply >2 inputs mux op for array ports because we have to unwrap array elements.

This was also my thought behind this. I agree with you @teqdruid that we could get away with redefining the mux op, but in the long run we'll end up having lots of emission hints and tweaks that we probably can't all reconcile into the actual op definitions at once. We're also making an effort in HW/Comb/Seq to have one way of doing things (not vs xor), so it seems counterproductive to start catering to the emission minutia by expanding op definitions which will inevitably make them less orthogonal and overlap (the comb.mux would get very close to hw.array_get).

That makes me less of a fan of trying to get emission details into HW, and encoding emission hints as ODS attributes on HW, because the ODS attributes suggest that these hints are part of the operation semantics, when in fact they are not: when you transform hw.array_get, you don't need to care about the emission flavor which is just an optimization. But the semantics of the op don't change if you remove that attribute.

So what this is trying to avoid is having upstream dialects use SV ops directly to encode emission optimizations that don't change the behaviour of the result. But we also don't want to encode SV minutia in the HW op. And so as a result we could make upstream dialects talk about emission tweaks through optional hints which can all be removed without changing correctness. That way HW stays self-contained, and upstream dialects can have a clean lowering, with the option of hinting downstream dialects/conversions to do things a certain way.

If we ever start writing a simulator, we really don't want to deal with he AST represented by the SV dialect. Instead we'd want to simulate a design in the HW/Comb/Seq dialects. Which also means we'd have to counteract the "erosion" of HW that we end up doing to get the SV output to look a specific way.

Sep 09 '22 18:09 fabianschuiki

I think it would be difficult to apply >2 inputs mux op for array ports because we have to unwrap array elements

Good point. Though having an ArrayExplodeOp (just like we have for structs) would solve this issue.

We're also making an effort in HW/Comb/Seq to have one way of doing things (not vs xor), so it seems counterproductive to start catering to the emission minutia by expanding op definitions which will inevitably make them less orthogonal and overlap (the comb.mux would get very close to hw.array_get).

Agreed. But then why do we have MuxOp at all? Couldn't it just be ArrayCreateOp and ArrayGetOp? I see this as an argument for ditching one or the other, not against multi-bit muxes.

But we also don't want to encode SV minutia in the HW op

I agree about the dialect attributes being just "optimizations", so I take back what I said in terms of the hackiness. Regardless, in this specific case I think MuxOp is appropriate. Would we not want to have the same SV optimizations on MuxOp? And have them automatically implied?

Misc question: if an ASIC synthesis engine doesn't use a mux for an ArrayGetOp, what does it use? Does it essentially create a mux out of other standard cells? Does it implement a shifter instead? On an FPGA, I can only imagine the mess of LUTs which a synthesis engine could naively create if it doesn't detect a mux.

Sep 09 '22 19:09 teqdruid

Misc question: if an ASIC synthesis engine doesn't use a mux for an ArrayGetOp, what does it use? Does it essentially create a mux out of other standard cells? Does it implement a shifter instead? On an FPGA, I can only imagine the mess of LUTs which a synthesis engine could naively create if it doesn't detect a mux.

We have seen array indexing is lowered into AndOr trees and AOI gates so we wanted to annotate the pragmas to use mux cells.

Sep 13 '22 11:09 uenoku

In some recent discussions the idea of adding LLVM-style intrinsics to CIRCT came up. These might apply quite well here, since we want to express a fairly precise mapping to gates in the HW dialect here, but we also don't want to overload comb.mux with a lot of second-order concerns and stuff it full of attribute. So one thing we could do is to introduce something like a hw.intrinsic.mux4 or hw.intrinsic.mux. I'm wondering if that would be a viable alternative here. Or if we should land this and then try to switch to intrinsics later.

Sep 13 '22 15:09 fabianschuiki

Said intrinsics will necessarily be tied to a particular technology, so the names will need to be slightly more specific than 'hw.intrinsic.mux`. One thing that we've discussed in the past is the idea of one or more dialects of device primitives: for ASIC std cell libs, all of the std cells. For FPGA, all of the device primitives for Xilinx devices. etc.

On Tue, Sep 13, 2022 at 8:55 AM Fabian Schuiki @.***> wrote:

In some recent discussions the idea of adding LLVM-style intrinsics to CIRCT came up. These might apply quite well here, since we want to express a fairly precise mapping to gates in the HW dialect here, but we also don't want to overload comb.mux with a lot of second-order concerns and stuff it full of attribute. So one thing we could do is to introduce something like a hw.intrinsic.mux4 or hw.intrinsic.mux.

— Reply to this email directly, view it on GitHub https://github.com/llvm/circt/pull/3843#issuecomment-1245612501, or unsubscribe https://github.com/notifications/unsubscribe-auth/AALNXYD6H2PNSHLCBFRL6MTV6CPVHANCNFSM6AAAAAAQIIIGEE . You are receiving this because you were mentioned.Message ID: @.***>

Oct 11 '22 08:10 teqdruid

We have since gained the sv.attributes attribute on ops which allow us to communicate these pragmas in a uniform way. Closing this.

Mar 01 '23 23:03 fabianschuiki

[ExportVerliog] Add a `sv.hint.emit_as_mux` attribute