Add counters to see the number of stalls and flushes during the code execution
Hello, I've written a basic implementation of this feature for rv5s. Currently the other CPUs simply report no stalls or flushes, so this is likely something we'd want to work on, possibly before merging? I was wondering if anyone could take a look at it, and tell me whether I should submit a PR: https://github.com/matms/Ripes/commit/f939da079db3c6744d28a99fa66a73665151bf8d.
(Note that I'm not actually counting flush "events", rather I simply consider each instruction flushed as a cycle "lost" due to that flush, and report that instead.)
Hi @matms
Thank you for looking into this. I'm wondering whether a better approach would be to use the StageInfo data already exposed through the processor interface - this is what is used to generate visualizations showing flushes/stalls.
This could be queried every cycle to determine stalls and flushes. This then leads to the question of what defines a stalled/flushed stage and how it is uniqued (in other words, if this interface is all we have, how do we make sure that we don't record stalls/flushes more than once?). For now, it would probably be sufficient to just assume that whenever the last stage is stalled/flushed, the counter is incremented. StageCount is accessed through this function.
Let me know what you think!
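The last-stage heuristic suggested above could be sketched roughly as follows. This is only an illustration: `StageInfoSketch` is a simplified, hypothetical stand-in for the real StageInfo type (its field names are assumptions), and `PipelineCounters` is a name made up for this example, not something that exists in Ripes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for the per-stage info exposed by the
// processor interface. The field names here are assumptions.
struct StageInfoSketch {
    enum class State { None, Stalled, Flushed };
    std::uint32_t pc = 0;
    bool valid = false;
    State state = State::None;
};

// Hypothetical helper that is sampled once per clock cycle.
struct PipelineCounters {
    unsigned long stallCycles = 0;
    unsigned long flushCycles = 0;

    // 'stages' holds one entry per pipeline stage, ordered first to last.
    // Only the last stage is inspected, so a bubble that travels down the
    // pipeline is counted exactly once: when it reaches the final stage.
    void sample(const std::vector<StageInfoSketch>& stages) {
        assert(!stages.empty());
        const StageInfoSketch& last = stages.back();
        if (last.state == StageInfoSketch::State::Stalled) {
            ++stallCycles;
        } else if (last.state == StageInfoSketch::State::Flushed) {
            ++flushCycles;
        }
    }
};
```

Looking only at the last stage sidesteps the "how do we avoid counting twice" question, at the cost of reporting a stall or flush a few cycles after it actually happened.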
I've made a quick change to now use StageInfo https://github.com/matms/Ripes/commit/b6a30c53e1c7f4fbbeeeed1bc78b98a9782f1998.
Also, I suppose it could be a good idea to change clockProcessor() (and reverse())'s handling of m_instructionsRetired to also use stageInfo (stageValid), but I'm not sure.
> I've made a quick change to now use StageInfo
What I was thinking is to not have explicit getStallCount and getFlushCycleCount functions in the processor interface. Having them there implies that the processors themselves are responsible for tracking this statistic. Instead, the processor interface is intended to just provide hooks into the current state of a processor model; the rest of Ripes (the GUI) is then responsible for using those hooks to do interesting things, one of which could be to keep a count of the number of stalls/flushes. (As you hint at, having the number of cycles/instructions retired tracked by the processor is therefore bad design, and should be fixed.)
So a better solution would be to mimic the design of the rest of the GUI (pipeline drawing, cache simulator, ...); that is, a bunch of components which get notified whenever the processor is clocked (https://github.com/mortbopet/Ripes/search?q=processorclocked) and from there maintain state locally. You could therefore implement such a function that, upon processor clocking, updates the flush/stall counts and reports them to the GUI.
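As a rough illustration of that separation, here is a minimal sketch in plain C++. It uses `std::function` callbacks as a stand-in for the actual processor-clocked notification, and every name in it (`ProcessorSketch`, `StallFlushStats`, `onClocked`, `lastStageState`) is hypothetical rather than taken from the Ripes code base.

```cpp
#include <functional>
#include <utility>
#include <vector>

enum class StageState { None, Stalled, Flushed };

// Hypothetical processor model: it only exposes its current state and
// notifies listeners when clocked. It keeps no statistics itself.
class ProcessorSketch {
public:
    using ClockedHandler = std::function<void()>;

    void onClocked(ClockedHandler handler) {
        handlers_.push_back(std::move(handler));
    }

    StageState lastStageState() const { return lastStage_; }

    // Advance one cycle. The argument stands in for real pipeline logic.
    void clock(StageState newLastStageState) {
        lastStage_ = newLastStageState;
        for (auto& handler : handlers_) {
            handler();
        }
    }

private:
    StageState lastStage_ = StageState::None;
    std::vector<ClockedHandler> handlers_;
};

// Hypothetical GUI-side component: subscribes to the clocked notification
// and maintains the stall/flush counts locally.
class StallFlushStats {
public:
    explicit StallFlushStats(ProcessorSketch& processor) {
        processor.onClocked([this, &processor] { update(processor); });
    }
    // The registered callback captures 'this', so copying is disabled.
    StallFlushStats(const StallFlushStats&) = delete;
    StallFlushStats& operator=(const StallFlushStats&) = delete;

    unsigned long stalls() const { return stalls_; }
    unsigned long flushes() const { return flushes_; }

private:
    void update(const ProcessorSketch& processor) {
        switch (processor.lastStageState()) {
            case StageState::Stalled: ++stalls_; break;
            case StageState::Flushed: ++flushes_; break;
            case StageState::None: break;
        }
    }

    unsigned long stalls_ = 0;
    unsigned long flushes_ = 0;
};
```

The point of the sketch is that `ProcessorSketch` has no getStallCount-style accessor at all; the counts live entirely in the listener, which must outlive the processor's use of its callback.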
I've pushed an update doing this (https://github.com/matms/Ripes/commit/ef0788d03e0b96c365561091ab1ea0d8a44d46dd). Some refactoring may still be needed, but I would like feedback on whether it is worth changing any of the existing code. In particular, I attempted to change instructionsRetired handling, but I ran into the issue that stageInfo tells us at best what is in WB right now, not what just left WB, so attempting to keep track of this in ProcessorTab would produce a change in behaviour (the count would be 1 higher than with the current method). Perhaps I could have ProcessorTab cache the last retired instruction's info, but I wonder if this is the best design choice.
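For reference, the "cache the last retired instruction's info" idea might look something like the sketch below. It assumes the existing counter is incremented when an instruction leaves WB, which is my reading of the off-by-one described above and not something confirmed here; `RetiredCounterSketch` and its members are hypothetical names, and reversing the clock is not handled.

```cpp
// Hypothetical GUI-side counter that is updated once per clock cycle.
class RetiredCounterSketch {
public:
    // 'wbValidNow' says whether WB holds a valid instruction in the
    // current cycle (e.g. derived from the stage-valid flag).
    void onClocked(bool wbValidNow) {
        // The instruction observed in WB during the previous cycle has
        // now left the pipeline, so it is counted as retired here.
        if (wbValidPrev_) {
            ++retired_;
        }
        wbValidPrev_ = wbValidNow;
    }

    unsigned long retired() const { return retired_; }

private:
    bool wbValidPrev_ = false;  // cached WB validity from the last cycle
    unsigned long retired_ = 0;
};
```

Counting `wbValidNow` directly would instead include the instruction currently sitting in WB, which is where a one-higher count would come from under this assumption.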
Yeah, the complexity lies in designing the measurement in such a way that it is generic and works with whatever is provided in the interface. In general, the UI should be agnostic about processor implementation. If this seems impossible during development, then that is probably an indicator that the processor interface is lacking something.
In any case, please do submit your code as a PR - it's a bit easier to work through the review in that manner. Also, please keep the implementation of flush/stall counters and modifying existing logic in separate PRs so things are in a manageable, reviewable, and atomic scope!