calyx icon indicating copy to clipboard operation
calyx copied to clipboard

`tdcc`: Invalid code generated when `done` depends on input signal

Open rachitnigam opened this issue 4 years ago • 11 comments
trafficstars

To reproduce, run:

fud exec -s verilog.data tests/correctness/tcam/lpm.futil.data tests/correctness/tcam/lpm.futil --to dat -v -s futil.flags '-d static-timing'

The reason we haven't caught this before is because this doesn't occur with the static-timing pass performs the compilation. I have not reduced which pass causes this issue.

rachitnigam avatar Aug 02 '21 14:08 rachitnigam

The problem is the implementation of match_element:

component match_element(in: 32, prefix: 32, length: 5, ml_done: 1) -> (match_line: 1, write_en: 1) {
  cells { ... }
  wires {
    group compare<"static"=1> {
      ...
      compare[done] = ml_done;
    }
  }
  control {
    compare;
  }
}

The input signal ml_done is used to determine when the group is done (and transitively, when the component is done). The problem is that such a component interface cannot be correctly invoked. An invoke group looks like this:

cells { me0 = match_element();  }
group invoke0 {
  ...
  me0.ml_done = r.done;
  invoke0[done] = me0.done;
}

The group essentially says that "keep the connection me0.ml_done = r.done active as long as me0.done is not done" which recursively depends on that very assignment being active.

The reason the static-timing compilation works is because it does not use me0.done signal, instead relying on the FSM counter to count the appropriate number of cycles.

rachitnigam avatar Aug 06 '21 16:08 rachitnigam

The solution to fix issue is making it invalid for the done signal of groups and components to depend on an input signal.

Also, out of curiosity, @cgyurgyik is there a reason match_element needed this interface? An alternate interface would be one that would use an internal register to save the state of the match_element and connect the .out port of the register to an output port of the component so that the value can be read after the invoke statement.

rachitnigam avatar Aug 06 '21 16:08 rachitnigam

Also, FWIW, match_element is a classic example of a "combination components" that are not supported by Calyx right now (as pointed out by @sampsyo).

rachitnigam avatar Aug 06 '21 16:08 rachitnigam

The solution to fix issue is making it invalid for the done signal of groups and components to depend on an input signal.

Also, out of curiosity, @cgyurgyik is there a reason match_element needed this interface? An alternate interface would be one that would use an internal register to save the state of the match_element and connect the .out port of the register to an output port of the component so that the value can be read after the invoke statement.

I haven't had a moment to look more in-depth, but isn't this the same interface as the Calyx representation of memory, without an address port?

Edit: e.g.

https://github.com/cucapra/calyx/blob/40ec4e8d5acfad98d7b79ad1e280fa47ce591401/examples/futil/memory-by-reference/memory-by-reference.futil#L21-L29

cgyurgyik avatar Aug 06 '21 17:08 cgyurgyik

Yeah, I just implemented a fix for the problem in #624 and realized that this way of "passing registers/memories by reference" doesn't actually work at all.

rachitnigam avatar Aug 06 '21 17:08 rachitnigam

Following up on @rachitnigam's call for deeper thinking on this in https://github.com/cucapra/calyx/pull/624#issuecomment-894636352: this is an intriguing issue, but I don't fully understand the implications yet. To summarize what I think I understand:

  • The original motivation for doing something like this is that you want a component to be able to use memories "by reference," which really just means passing in all the signals to connect to that memory into the component so it can use that memory at will. Or more generally, a super-component would like to delegate the use of one of its subcomponents to a different subcomponent.
  • However, to actually use the "by-reference" component, the receiving subcomponent needs to have some of its groups signal done based on the "by-reference" component's done. It's just like using a component you own, except you're using those passed-in signals instead of the ports on a real, actual component name.
  • That strategy ends up creating a cycle on the subcomponent's done signal for reasons I can sorta-kinda see but can't really articulate having to do with the super-component's go insertion?
  • We could consider defining the problem away by saying that it's not allowed to use input ports for your done signals (#624 attempts to implement an automated check implementing that proposed change in policy), but this would rule out the aforementioned "by-reference" strategy entirely.
  • So we're left with two big questions:
    • Should done signals be allowed to depend on input ports? What are the implications of a decision either way? It seems like there are some nice encapsulation properties to be had, but it might be too restrictive.
    • If not, how will we accomplish this whole "by-reference"/delegation pattern? It seems super important to allow in some form, and if not with the aforementioned strategy, then how?

Anyway, seems like a synchronous conversation with at least me, @rachitnigam, and @EclecticGriffin is in order…

sampsyo avatar Aug 10 '21 14:08 sampsyo

That strategy ends up creating a cycle on the subcomponent's done signal for reasons I can sorta-kinda see but can't really articulate having to do with the super-component's go insertion?

It's annoying to articulate but basically two things need to be true for the cycle to happen:

  1. The only group in the component has its done condition passed in by-reference (so you have a control program: control { group }). This means that no FSM is instantiated. The lack of FSM is important because this is what's going to cause there to be a combinational loops with the done signal in the invoking component.
  2. static-timing is disabled. This is important because static-timing generates FSMs that don't use the done signal. By disabling the pass, we'll end up reading the done signal and creating a combinational loop.

This leaves us with two distinct problems:

  1. The component we defined, with one group, actually probably wants to be a combinational component. For example, match_element could probably perform the computation it need in a combinational manner and be invoked in the group that needs to write to the memory.
  2. There might be legitimate uses of "pass-by-reference" component but we don't have a defined semantics for such things precisely because typing for signals is so loose. Really, when we get component signals provided to us, we need to know their latency and relationship (in a manner captured by Filament). For example, in a Filament-typed Calyx, we would say that mem_write_en and mem_done are related and n cycles after writing to mem_write_en, mem_done will become high.

rachitnigam avatar Aug 11 '21 04:08 rachitnigam

Got it; thanks for the explanation! Perhaps stating the obvious again, but this sort of "pass-by-reference" thing needn't be exclusive to combinational situations… you can imagine the same setup being relevant in a situation where everything is sequential. You won't end up with a combinational cycle then (?), but maybe the consequences are still weird.

sampsyo avatar Aug 12 '21 00:08 sampsyo

Yeah, that seems to be the case---sequential implementation of "pass-by-reference" works fine. Maybe we should require that the group using mem_done still uses a component-defined cell for its done signal but can use mem_done internally. At least this way, the group will have a well-defined done condition? Maybe I'm wrong about the last thing. I think it'd be useful to just discuss this whole mess together and see what principles fall out.

rachitnigam avatar Aug 12 '21 03:08 rachitnigam

Huh, yeah. Enthusiastic yes to a broader conversation; this is a tricky issue.

If the issue is mainly confined to the combinational case, there's a chance the underlying issue is not so much go/done signals in specific as combinational paths in general. I understand that, in "normal" RTL design, one generally has to take care to know which ports on a given module are combinational connected to which other ports. When these combinational paths exist, they create an obligation on the client module to ensure that the signals eventually get registered (or else re-exposed as combinational paths) to avoid accidentally creating inter-module combinational cycles. Or, a simpler/practical policy is that modules should never expose ports with combinational paths between them, i.e., every path from an input port to an output port should include at least one register.

I guess I'm optimistic that we can somehow confine the confusion by creating a special designation for "combinational components" and requiring "normal" components to be sequential.

sampsyo avatar Aug 12 '21 11:08 sampsyo

Another ponderous solution to this problem, or rather a guarantee that should possibly be provided by the compiler, is that there is no combinational path between a component's go and done signals.

rachitnigam avatar May 05 '22 19:05 rachitnigam