spinnaker_tools
Add router diagnostic for "lost" local-sourced packets
One thing the router does not currently count is packets that are sent by a local core but match no routing entry. Because these are not counted, packets can simply "go missing"; default routed external packets will at least show up in the external multicast packet count, as I understand it. It could be useful to add this to the default counters set up by SC&MP.
I believe that the settings of the counter should be "Local packet", "default routed" and "Multicast" (this seems to work when I tested it) - the hex value is 0x1ff75f1.
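For anyone wanting to check that value, here is a minimal sketch of how the word might be assembled from those three selections. The bit offsets are an assumption based on my reading of the router's diagnostic filter control register (packet type from bit 0, emergency routing in bits 4 to 8, default-routed status from bit 10, payload from bit 12, source from bit 14, destination from bit 16), and the constant and function names are made up for illustration; none of them come from SC&MP or SARK.

```python
# Assumed field layout of a router diagnostic filter control word.
# The offsets and names below are illustrative assumptions.
PACKET_TYPE_MC = 1 << 0          # count multicast packets
EMERGENCY_ROUTE_ANY = 0xF << 4   # accept every emergency-routing field value
EMERGENCY_MODE = 1 << 8          # emergency-routing mode bit (assumed meaning)
DEFAULT_ROUTED = 1 << 10         # default-routed packets only
PAYLOAD_ANY = 0x3 << 12          # packets with or without a payload
SOURCE_LOCAL = 1 << 14           # packets sourced by a local core
DEST_ANY = 0x1FF << 16           # any destination


def local_default_routed_mc_filter() -> int:
    """Build the filter word for local, default-routed multicast packets."""
    return (PACKET_TYPE_MC | EMERGENCY_ROUTE_ANY | EMERGENCY_MODE
            | DEFAULT_ROUTED | PAYLOAD_ANY | SOURCE_LOCAL | DEST_ANY)


print(hex(local_default_routed_mc_filter()))  # 0x1ff75f1
```

Under those assumptions, only the three named selections are restrictive; every other field is left wide open so that it does not filter anything out.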
If it is considered an application issue, we can add it to our application instead, but it seemed like a general application diagnostic issue.
It's worth noting that this became very apparent while debugging the external devices (I'm in a debug session as I type): packets were being sent but never reported as lost, because the key fired by the command sender was not tied to a router entry. The behaviour therefore occurred without anything reporting lost packets, which is probably why we haven't noticed until now how disconnected the command sender has become from the rest of the tools.
The code for creating this filter is now in place on the FEC branch command_sender_checks and is tied directly into the provenance gathering tools. If this becomes a general application diagnostic, the FEC branch would need only minimal changes to read it from a defined register instead of from user3, which it currently uses for this filter.
An additional thought is that we should also have a counter for external multicast default routed packets. This could then be used to detect errors: if the routing table on a chip has no expected default routes and default routed packets are passing through it, there must be an error.
Just to add: this has now also been added, as router diagnostic user2, via the command_sender_checks branch of FEC.
It's worth noting that we're now forcing user 2 and user 3 to be these filters, which stops anyone else from using them. I think we could possibly remove all filters from sark / scamp and have the top level tools set them. This would allow us to track when something is setting registers and adapt to it. We'd need to tie in the mapping between provenance data items and their registers so that we can extract them from the given registers properly.
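The check described above could look something like the sketch below. It assumes the tools have already read the external default routed counter for each chip and know which chips are expected to default route; the function name and the shape of both inputs are hypothetical.

```python
def unexpected_default_routing(counts, chips_expecting_default_routes):
    """Return chips that default routed packets without being expected to.

    counts: dict mapping (x, y) chip coordinates to the number of external
        default routed multicast packets counted on that chip (hypothetical).
    chips_expecting_default_routes: set of (x, y) chips whose routing was
        planned to rely on default routes (hypothetical).
    """
    return sorted(
        chip for chip, count in counts.items()
        if count > 0 and chip not in chips_expecting_default_routes
    )


# Illustrative values only: chip (0, 1) default routed 5 packets unexpectedly.
counts = {(0, 0): 0, (0, 1): 5, (1, 0): 3}
print(unexpected_default_routing(counts, {(1, 0)}))  # [(0, 1)]
```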
It is worth noting that you can already do this - you don't have to keep the initial values. The only issue is that this has to be set up on every chip; maybe an extra command could be added to SC&MP to ask for this to be sent to all chips (flood fill routing diagnostics).
I appreciate that we can rewrite them, but if you want to tell the defaults apart from values someone else has set from the get-go, it would be much easier if they were empty by default instead of having 12 of them set, especially as those can be reset at any time. Just think of a user who moved filter 4 onto filter 7: would we detect this? Should we have detected this?
I do agree with you that a flood fill would be much easier. The code currently runs during the loading of each router table, so we're getting that loop almost for free, although setting it once would be faster. The issue of multiple simulations on the same machine might mean you want a controlled flood rather than an all-chips flood, like the region stuff from ybug.
I would suggest that our software just writes those that it wants and tells the user which ones it doesn't use; it would then be up to the user to ensure they aren't conflicting with our own. The list of the ones in SC&MP is accessible, so you can do a direct comparison of the words if you want to.
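A direct comparison of that kind might be as simple as the sketch below. It assumes both the SC&MP defaults and the values read back from a chip are available as equal-length lists of filter words; the function name is hypothetical and the example words are placeholders, not the real SC&MP defaults.

```python
def changed_filters(default_words, current_words):
    """Return {filter index: (default word, current word)} for each mismatch."""
    if len(default_words) != len(current_words):
        raise ValueError("expected the same number of filter words in both lists")
    return {
        index: (default, current)
        for index, (default, current) in enumerate(zip(default_words, current_words))
        if default != current
    }


# Placeholder words only: filter 1 has been overwritten by someone.
defaults = [0x11, 0x22, 0x33]
current = [0x11, 0x99, 0x33]
print(changed_filters(defaults, current))  # {1: (34, 153)}
```

A filter moved from one slot to another would show up here as two mismatches, one at the old index and one at the new.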