Example of Merlin mesh topology?

Open afranques opened this issue 4 years ago • 10 comments

Hello, I haven't had much luck finding an example of the Merlin mesh topology. Could anyone please share an example of how to use it?

An ideal example would be:

  • A 4 by 4 mesh --> 16 tiles in total, as shown in Figure (b)
  • Each tile contains one router (I assume), and each router has 2 local ports: one for a private L1, and another for a slice of the LLC (L2), as shown in Figure (a)

[Figure: Tiled chip multiprocessors. (a) Diagram of an individual tile. (b) 16-way tiled CMP.]

What I'm mostly interested in is figuring out how to set up each of the endpoints to the mesh, in tiles. I looked at the Merlin test examples for other topologies (there are none for mesh), but they simply call topo.setEndPoint(endPoint). I looked at the code of setEndPoint and the related functions, but I couldn't get much out of it.

Another question I have is: if the number of local ports must be the same for each router, how do I connect the memory controllers to the mesh? Ideally I would like two of the tiles to have 3 local ports instead of 2, so that besides the L1 and LLC slice, I can connect a mem controller to those as well. Since I assume this is not possible with Merlin, the alternative would be to use 2 of the tiles in the mesh exclusively to connect the memory controllers. However, if I do that, then I would only have 14 tiles that I could use for cores, since 2 of them would be used exclusively for the mem controllers. Having a non-power-of-two number of cores doesn't work very well with some benchmarks, so this would complicate things for me.

Thank you very much in advance to anyone who takes the time to reply.

afranques avatar May 28 '20 22:05 afranques

I can give you a basic idea. @feldergast is the main author of Merlin. @gvoskuilen has done the most with detailed NoC simulations.

It sounds like you want to do a detailed simulation of memory traffic on a NoC (as opposed to simulation of MPI traffic). The endpoint method just gives a mechanism for specifying different types of endpoints. In your case, you want a detailed processor model like Ariel to be your endpoint (the other option is usually Ember).

Have a look at ariel_snb_mlm.py in the sst-elements repo. It shows how to set up an ariel simulation (which generates memory traffic), a memNIC (which submits memory requests onto the NoC through the SimpleNetwork interface), and sets up a merlin topology.

jjwilke avatar May 28 '20 23:05 jjwilke

If you are not tied to using Merlin, an alternative would be Kingsley for the network. The high level differences are that Kingsley only supports mesh, doesn't have output buffering (input buffering only), but it supports unconnected ports. Look at: sst-elements/src/sst/elements/memHierarchy/tests/testKingsley.py for an example.

gvoskuilen avatar May 28 '20 23:05 gvoskuilen

Also, for Merlin I have done it this way (pseudo-code):

class PopulatedPort:
    def build(self, nodeID, extraKeys):
        # build component(s) and return the network-connected port

class UnpopulatedPort:
    def build(self, nodeID, extraKeys):
        pass

def buildCore(id):
    if id is a populated port:
        return PopulatedPort()
    else:
        return UnpopulatedPort()

topo.setEndPointFunc(buildCore)
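
For anyone who wants to try the pattern outside of SST first, here is a minimal runnable sketch with stand-in classes. The `POPULATED_IDS` set and the placeholder string returned by `build` are hypothetical; what a real `build` has to return to Merlin is not spelled out in this thread, so treat that part as an assumption.

```python
# Endpoint IDs that have a component attached (hypothetical choice).
POPULATED_IDS = {0, 1, 3, 4}


class PopulatedPort:
    def build(self, nodeID, extraKeys):
        # A real config would create the SST component(s) here and return
        # the network-facing port; a placeholder string stands in for it.
        return "endpoint_%d.port" % nodeID


class UnpopulatedPort:
    def build(self, nodeID, extraKeys):
        # Nothing is attached to this router port.
        return None


def buildCore(nodeID):
    # Pick the endpoint builder based on whether the port is populated.
    if nodeID in POPULATED_IDS:
        return PopulatedPort()
    return UnpopulatedPort()


# Stand-in for what the topology would do for each endpoint ID.
built = [buildCore(n).build(n, {}) for n in range(6)]
```

In an actual configuration, `buildCore` would be handed to `topo.setEndPointFunc` as in the pseudo-code above instead of being called directly.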

gvoskuilen avatar May 28 '20 23:05 gvoskuilen

Thank you very much for your prompt replies, @jjwilke and @gvoskuilen! After spending a while looking at the Merlin torus setup in ariel_snb_mlm.py, and the Kingsley mesh setup in testKingsley.py, things are getting a lot clearer. Regarding the Merlin setup, would you mind shedding some more light on the following 2 questions?

Question 1: In ariel_snb_mlm.py, how come ringstop_params shows "torus:width":"1" (1 link between routers of the same dimension), but the script then sets up 2 links (positive and negative) between each pair of routers, as shown in the loop that starts at line 165, with code:

rtr_link_positive = sst.Link("rtr_pos_" + str(next_ring_stop))
rtr_link_positive.connect( (router_map["rtr." + str(next_ring_stop)], "port0", ring_latency), (router_map["rtr." + str(next_ring_stop+1)], "port1", ring_latency) )
rtr_link_negative = sst.Link("rtr_neg_" + str(next_ring_stop))
rtr_link_negative.connect( (router_map["rtr." + str(next_ring_stop)], "port1", ring_latency), (router_map["rtr." + str(next_ring_stop-1)], "port0", ring_latency) )

Question 2: Also in ariel_snb_mlm.py, there are 4 network groups, and each group consists of 7 endpoints (2 L2s, 1 directory controller, and 4 slices of the shared L3), so 28 endpoints in total. I was expecting that each group of 7 endpoints would be connected to a single router of the torus, and then the torus would connect the 4 routers/groups. Instead, the code shows that each of the endpoints of each group is actually placed on a different router, with 28 routers in total. That's why the parameters show "torus:shape":<an expr that equals 28> (torus has 28 routers in a single dimension), "num_ports":"3" (each router has 1 port for the endpoint and 2 to link with its neighbor routers), and "torus:local_ports":"1" (each router only has 1 endpoint connected to it). So my question is, instead of creating 28 routers, could we have either of the following 2 options?

  • Option 1: A single router per group (4 in total), where each router has the 7 endpoints connected to it, and then have a torus with links between each of the 4 routers. If so, I'm assuming the parameters would look like this?: "torus:shape":"4" (we now have only 4 routers, one per group), "num_ports":"9" (each router needs to connect to 7 endpoints, and to 2 neighbor routers), and "torus:local_ports":"7" (7 endpoints per router).
  • Option 2: Declare each of the 4 routers as "topology":"merlin.singlerouter", each having all 7 endpoints of the group connected to it. Then create another 4 routers, but declared as "topology":"merlin.torus". Then create a 1-to-1 bridge between the merlin.singlerouter and merlin.torus routers (i.e. for i in range(4), connect singlerouter_i to torusrouter_i). Finally, create the links between the torus routers.
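
The port arithmetic behind the current setup and Option 1 can be checked with a small sketch. The parameter names are the ones quoted from ariel_snb_mlm.py above; the helper `torus_router_ports` is hypothetical and simply assumes each dimension needs a "+" and a "-" neighbor port per unit of width, on top of the local ports.

```python
def torus_router_ports(local_ports, dims=1, width=1):
    # Each dimension contributes a "+" and a "-" neighbor, each `width` links wide.
    return local_ports + 2 * dims * width


# Current layout in the script: 28 routers, one endpoint each.
current = {
    "torus:shape": "28",
    "torus:width": "1",
    "torus:local_ports": "1",
    "num_ports": str(torus_router_ports(1)),
}

# Option 1: 4 routers, seven endpoints each.
option1 = {
    "torus:shape": "4",
    "torus:width": "1",
    "torus:local_ports": "7",
    "num_ports": str(torus_router_ports(7)),
}

# Both layouts expose the same total number of endpoints.
total_current = int(current["torus:shape"]) * int(current["torus:local_ports"])
total_option1 = int(option1["torus:shape"]) * int(option1["torus:local_ports"])
```

This only verifies the counts (3 ports per router today, 9 under Option 1, 28 endpoints either way); it says nothing about whether routing behaves as intended.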

Even though the configuration that I'm building is for Merlin mesh instead of torus, I'm attaching what I have so far anyway, so that maybe you get a better idea of what I'm trying to do. For now I'm using Prospero as the CPU, since I believe it will be easier to debug with predefined traces, but after that I'll switch to Ariel.

Thank you very much again.

afranques avatar May 29 '20 10:05 afranques

  1. The width is just the number of links between connected routers, so you have 1 link between connected routers. If you had two links on each dimension, width would be 2. These parameters are actually documented in the torus.h header file. The sst-info executable prints this documentation along with the other component details.

  2. I think Option 1 is going to be easier. The challenge is getting the routing to work. You'll get traffic correctly routed with Option 1 since it's just a mesh with multiple injection ports. I'm not sure how the routing would work on Option 2 since it's no longer a mesh.

jjwilke avatar May 29 '20 15:05 jjwilke

Thanks again, @jjwilke.

  1. So then there's a bug in ariel_snb_mlm.py, since ringstop_params show "torus:width":"1", but then it sets up 2 links (positive and negative) between each pair of routers in line 165, correct?
  2. I'll go with Option 1 then. Also, I'm assuming it's not allowed to have different values of num_ports and local_ports for different routers of the Merlin topology, right? I'm asking because in my config file I have 2 tiles (out of 16) that besides the L1 and L2 will also have a mem controller connected to the router. I'm assuming in this case I need to set num_ports and local_ports to the max values I'll ever have, and simply have the mem controller port being unused in the rest of the routers/tiles?

afranques avatar May 29 '20 18:05 afranques

The positive and negative count as a single link. I guess that's a bit confusing. Each link (I think) sends data one direction and credits the other direction on a given link. So I guess the width is better described as # of bi-directional links.

And, yes, I believe the routers just do basic division dest_id / num_local_ports to find the dest router - so it's not set up for non-uniform routing. Your solution is probably the best way to do it.
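
A minimal sketch of what that division implies for the 4 by 4 mesh with a uniform 3 local ports per router. The division comes from the description above; using the remainder as the local slot, the slot ordering (L1, L2 slice, memory controller), and the choice of tiles 0 and 15 for the memory controllers are all assumptions made for illustration.

```python
LOCAL_PORTS = 3           # slot 0: L1, slot 1: L2 slice, slot 2: mem controller (assumed order)
NUM_ROUTERS = 16          # 4 x 4 mesh
MEM_CTRL_TILES = {0, 15}  # hypothetical tiles that host a memory controller


def dest_router(dest_id, local_ports=LOCAL_PORTS):
    # Router that owns an endpoint: dest_id / num_local_ports (integer division).
    return dest_id // local_ports


def local_slot(dest_id, local_ports=LOCAL_PORTS):
    # Assumed: the remainder selects the local port on that router.
    return dest_id % local_ports


def endpoint_id(router, slot, local_ports=LOCAL_PORTS):
    # Inverse mapping: endpoint ID for a given router and local slot.
    return router * local_ports + slot


# Slot 2 is populated only on the memory-controller tiles; elsewhere it stays unused.
populated = [
    endpoint_id(r, s)
    for r in range(NUM_ROUTERS)
    for s in range(LOCAL_PORTS)
    if s < 2 or r in MEM_CTRL_TILES
]
```

Under these assumptions the endpoint IDs are no longer contiguous: 48 IDs exist, 34 are populated, and the two memory controllers sit at IDs 2 and 47.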

jjwilke avatar May 29 '20 22:05 jjwilke

Hmm, I see. And I'm assuming the same is true for the Merlin mesh then. So, for any given pair of routers (for example rtr0 and rtr1), we need a pair of links such as:

# Link from port0 of rtr0 to port1 of rtr1
link_rtr0_rtr1 = sst.Link("link_rtr0_rtr1")
link_rtr0_rtr1.connect( (rtr0, "port0", latency), (rtr1, "port1", latency) )

# Link from port1 of rtr1 to port0 of rtr0
link_rtr1_rtr0 = sst.Link("link_rtr1_rtr0")
link_rtr1_rtr0.connect( (rtr1, "port1", latency), (rtr0, "port0", latency) )

So, just to confirm, it is meant to be that both links use port0 in rtr0, and port1 in rtr1?

afranques avatar May 29 '20 22:05 afranques

Hmm. Okay, I think I'm getting myself into trouble with code I didn't write and haven't read in a while. You might want to wait for a response from @feldergast. We actually do NOT have separate credit and data ports in portControl.cc...

Port 1 connects to the "-" switch. Port 0 connects to the "+" switch - so there are only two ports per switch and width is indeed 1. If anything, we might be instantiating different links on the same ports in Python (which wouldn't really affect the simulation, but should be an input error)? I will try to check this later. But looking at the Python again, there is a single port-port connection between routers.
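
One way to catch the "different links on the same ports" case before handing a config to SST is a plain-Python check over the intended connections. This is a sketch only: the `reused_ports` helper and the tuple layout are hypothetical, and the example data mirrors the two-link snippet posted earlier in the thread.

```python
from collections import Counter


def reused_ports(links):
    # links: iterable of (link_name, (router, port), (router, port)).
    # Returns every (router, port) endpoint that appears on more than one link.
    counts = Counter()
    for _name, end_a, end_b in links:
        counts[end_a] += 1
        counts[end_b] += 1
    return sorted(end for end, n in counts.items() if n > 1)


# Two links between the same pair of ports, as in the earlier snippet.
two_links = [
    ("link_rtr0_rtr1", ("rtr0", "port0"), ("rtr1", "port1")),
    ("link_rtr1_rtr0", ("rtr1", "port1"), ("rtr0", "port0")),
]

# A single port-to-port connection between the two routers.
one_link = two_links[:1]

conflicts_two = reused_ports(two_links)
conflicts_one = reused_ports(one_link)
```

With the two-link version both `rtr0.port0` and `rtr1.port1` are flagged; with a single link nothing is.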

jjwilke avatar May 29 '20 23:05 jjwilke

So, although we might want to wait for @feldergast, if I understood correctly your implication is that the link setup in ariel_snb_mlm.py might be wrong, and we actually don't have to create 2 links for a pair of routers, but only 1 (which is already bi-directional by default)?

I looked at testKingsley.py (line 506) and it seems Kingsley only uses 1 link for a given pair of routers. So it's probably safe to assume Merlin should do the same? This is what Kingsley does:

for y in range(0, mesh_stops_y):
    for x in range (0, mesh_stops_x):

        # North-south connections
        if y != (mesh_stops_y -1):
            kRtrReqNS = sst.Link("krtr_req_ns_" + str(i))
            kRtrReqNS.connect( (kRtrReq[i], "south", mesh_link_latency), (kRtrReq[i + mesh_stops_x], "north", mesh_link_latency) )

        # West-east connections
        if x != (mesh_stops_x - 1):
            kRtrReqEW = sst.Link("krtr_req_ew_" + str(i))
            kRtrReqEW.connect( (kRtrReq[i], "east", mesh_link_latency), (kRtrReq[i+1], "west", mesh_link_latency) )
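
To see how many links this loop structure produces, here is a sketch that enumerates the same connections without SST. It assumes `i` is the row-major router index (`y * mesh_stops_x + x`), which the excerpt above does not show; the `mesh_links` helper and its tuple layout are hypothetical.

```python
def mesh_links(stops_x, stops_y):
    # One (router_a, port_a, router_b, port_b) tuple per neighboring pair.
    links = []
    for y in range(stops_y):
        for x in range(stops_x):
            i = y * stops_x + x  # assumed row-major router index
            # North-south connection to the router one row down.
            if y != stops_y - 1:
                links.append((i, "south", i + stops_x, "north"))
            # West-east connection to the router one column right.
            if x != stops_x - 1:
                links.append((i, "east", i + 1, "west"))
    return links


links = mesh_links(4, 4)

# Every (router, port) pair touched by any link.
ports_used = [(a, pa) for a, pa, _b, _pb in links] + [(b, pb) for _a, _pa, b, pb in links]
```

For a 4 by 4 mesh this gives 24 links, and each router port shows up on at most one of them.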

afranques avatar May 30 '20 01:05 afranques