no input injection to H-module

Open takerum opened this issue 5 months ago • 2 comments

Hi, I’m really impressed by your work, and I have a quick question about the HRM network. I’d love to understand more about the motivation behind one particular design choice.

In your model, the input embedding is only added to the L-module branch and not to the H-module branch. I’m wondering if there is any particular reason for this design choice. Have you found through experiments that injecting the embedding into the H-module degrades performance?
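To make the question concrete, here is a minimal sketch of the two variants I have in mind. It is a toy under stated assumptions, not the repository's code: `toy_module`, `hrm_forward`, and the `inject_into_H` flag are hypothetical names, and the tanh-of-sum update is only a stand-in for the real recurrent blocks.

```python
import math


def toy_module(state, injections):
    """Hypothetical stand-in for a recurrent block: add every injected
    vector to the state element-wise, then squash with tanh."""
    total = list(state)
    for inj in injections:
        total = [t + v for t, v in zip(total, inj)]
    return [math.tanh(t) for t in total]


def hrm_forward(x, z_H, z_L, H_cycles=2, L_cycles=2, inject_into_H=False):
    """Toy two-timescale loop (assumed structure, for illustration only).

    x   : input embedding
    z_H : slow (high-level) state
    z_L : fast (low-level) state
    """
    for _ in range(H_cycles):
        for _ in range(L_cycles):
            # The L-module sees the H state plus the input embedding.
            z_L = toy_module(z_L, [z_H, x])
        # As described above, the H-module sees only the L state;
        # the hypothetical flag adds the input embedding here as well.
        h_inputs = [z_L, x] if inject_into_H else [z_L]
        z_H = toy_module(z_H, h_inputs)
    return z_H, z_L


if __name__ == "__main__":
    x = [1.0, -1.0]
    z_H0, z_L0 = [0.1, 0.2], [0.3, 0.4]
    print(hrm_forward(x, z_H0, z_L0))
    print(hrm_forward(x, z_H0, z_L0, inject_into_H=True))
```

With `inject_into_H=False`, the input can reach `z_H` only indirectly through `z_L`, which is the behavior I am asking about.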

Aug 12 '25 23:08 takerum

Lower-level details end up confusing the slow strategic module, no? I can't imagine the CEO reviewing every single line of code his engineering team generates.

Aug 13 '25 06:08 narvind2003

Makes sense

Also, I suspect this V1 is focused on simplicity

Aug 14 '25 00:08 kroggen