
Regression methodology

jotego opened this issue 1 year ago • 4 comments

Define a regression methodology covering the areas below.

Core compilation

All cores should be recompiled when

  • $JTFRAME/src/jtframe changes

Ideally, changes that only affect some cores should be detectable so that only those cores are recompiled, but a simpler alternative is to recompile everything.

Success is defined as the generation of the RBF file; whether the file actually works is not checked.

This is currently implemented as daily builds in GitHub and builds for pull requests.
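As a rough illustration, here is a minimal sketch of the RBF check, assuming a hypothetical build_core entry point and a releases/ output path (the real build command and release layout may differ):

```bash
#!/bin/bash
# Sketch: rebuild every core and report success only if an RBF comes out.
# build_core and the releases/ path are assumptions, not the real tooling.
for dir in cores/*/; do
    name=$(basename "$dir")
    if build_core "$name" && [ -f "$dir/releases/$name.rbf" ]; then
        echo "PASS $name"
    else
        echo "FAIL $name (no RBF generated)"
    fi
done
```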

MRA/JSON generation

MRA files should be regenerated when

  • doc/mame.xml changes
  • $JTFRAME/src/jtframe/mra changes

The MRA/JSON files should be compared with those in JTBIN and checked for

  • changes in default DIP values
  • changes in DIP bit position
  • changes in the ROM md5 sum
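A minimal sketch of such a comparison, assuming JTBIN is checked out locally at $JTBIN and that DIP defaults, DIP bit positions and ROM checksums appear as default="...", bits="..." and md5="..." attributes in the MRA files:

```bash
#!/bin/bash
# Sketch: compare generated MRA files against the JTBIN copies.
# $JTBIN and the attribute patterns are assumptions about the layout.
for mra in mra/*.mra; do
    ref="$JTBIN/mra/$(basename "$mra")"
    [ -f "$ref" ] || { echo "NEW $mra"; continue; }
    for attr in default bits md5; do
        if ! diff <(grep -o "$attr=\"[^\"]*\"" "$mra") \
                  <(grep -o "$attr=\"[^\"]*\"" "$ref") > /dev/null; then
            echo "DIFF $mra ($attr attributes changed)"
        fi
    done
done
```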

Core simulation

Every core can have a regression list that is run to verify that the core boots up correctly:

  • List all simulations in a single yaml file in the cores/corename/ver folder
  • Each simulation should have its own subfolder inside ver
  • Either have a specific sim.sh script in the test folder or have the parameters for jtsim specified in the regression file
  • Regression tests should print either PASS or FAIL on the terminal
  • A new jtframe command or a new bash script should run all the regression tests
  • Regression tests should be runnable locally (taking advantage of the GNU parallel command); see the sketch after this list
  • Regression tests should be runnable as a single GitHub action with multiple outputs
  • The regression command should be able to grab a file from archive.org and run the mra tool to generate a valid .rom file for the test
  • Support for iverilog and verilator simulations
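A minimal sketch of the local runner mentioned above, assuming one test per subfolder under ver/ and a sim.sh that prints PASS or FAIL (falling back to jtsim parameters from the regression file is left out of this sketch):

```bash
#!/bin/bash
# Sketch: run every regression test in parallel. The folder layout and
# the reliance on sim.sh are assumptions taken from the list above.
run_test() {
    if [ -x "$1/sim.sh" ]; then
        ( cd "$1" && ./sim.sh )   # expected to print PASS or FAIL
    else
        echo "FAIL $1 (no sim.sh; jtsim fallback not covered here)"
    fi
}
export -f run_test
find cores/*/ver -mindepth 1 -maxdepth 1 -type d | parallel run_test {}
```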

jotego avatar Dec 21 '22 07:12 jotego

Hi! I'm replying here to your mail. It would be good of course, but I don't see how it can be done. For example, a regression in some 8-bit computer cores is whether this or that demo glitches on a scene or not. Testing that automatically on an FPGA is unimaginable. Take regular screenshots (how can that be done from the SDRAM, when the controller is sometimes maxed out for a specific core?) and compare them with known-good values? Even capturing from BRAM could be tricky if both ports are already used.

It's more plausible to do some tests in simulation; in Verilator you can check any state at any moment, inject any signals, etc. But it's slow, even if you target only a single subsystem. And tests should already be written during module development, which is not what developers (including me...) like to do. It basically doubles the development effort, which is not very motivating for a hobbyist like me. For more serious things (what you do) it can pay off. However, even thinking up test cases is not trivial - most issues only come out when you run the actual software. If I identify what the software does and fix something accordingly - that's the point when a test case should be added. But emulating the software without the actual software can also be cumbersome (like simulating port writes in a defined pattern).

For continuous shared component development, I like to keep components as backward compatible as possible, putting new breaking features behind module parameters or ifdefs. I avoid rewriting modules. But I don't maintain common modules of such complexity as you do.

gyurco avatar Sep 13 '23 11:09 gyurco

About shared modules, I have a principle which I first encountered in another project: don't make a module shared until 3 different systems can use it. If it's good for 3 users, there's a chance that the design is sound and the interface is good.

gyurco avatar Sep 13 '23 11:09 gyurco

> Hi! I'm replying here to your mail. It would be good of course, but I don't see how it can be done. For example, a regression in some 8-bit computer cores is whether this or that demo glitches on a scene or not. Testing that automatically on an FPGA is unimaginable. Take regular screenshots (how can that be done from the SDRAM, when the controller is sometimes maxed out for a specific core?) and compare them with known-good values? Even capturing from BRAM could be tricky if both ports are already used.

Indeed, evaluating the output is a problem in itself. But there are some established approaches to it. For instance, you can compute a signature of the video stream: imagine some kind of CRC run over each frame, which reduces the amount of information per frame to just 32 bits. You can store the correct CRCs for 5-10 minutes of video in the MRA file and have the core compared against them while it runs.

That is just an example of one way to accomplish the PASS/FAIL assessment.
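As an illustration of the idea, here is a minimal host-side sketch, assuming the simulation dumps each frame as a raw file named frame_NNN.raw and that a known-good signature list exists; on the FPGA the CRC would instead be computed in hardware:

```bash
#!/bin/bash
# Sketch: reduce every dumped frame to a 32-bit CRC and compare the
# sequence against a known-good signature list. The file names and
# expected_signatures.txt are assumptions for this example.
for f in frame_*.raw; do
    cksum "$f" | awk '{printf "%08x\n", $1}'   # CRC of the whole frame
done > signatures.txt
if diff -q signatures.txt expected_signatures.txt > /dev/null; then
    echo PASS
else
    echo FAIL
fi
```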

> It's more plausible to do some tests in simulation; in Verilator you can check any state at any moment, inject any signals, etc. But it's slow, even if you target only a single subsystem. And tests should already be written during module development, which is not what developers (including me...) like to do. It basically doubles the development effort, which is not very motivating for a hobbyist like me. For more serious things (what you do) it can pay off.

I have part of that workflow done. I think it is necessary but not sufficient, because sometimes you don't see the error in Verilator. For instance, there are bugs sensitive to when something occurs, like when you have two CPUs talking to each other and the original software is a bit too sensitive to the exact timing. Then it may work in simulation but not after synthesis.

Another problem is that the simulation does not cover the interaction with the firmware or, in my case, large parts of the FPGA logic that interact with that firmware. My Verilator sims basically exercise the core game module, not the real top level.

> For continuous shared component development, I like to keep components as backward compatible as possible, putting new breaking features behind module parameters or ifdefs. I avoid rewriting modules. But I don't maintain common modules of such complexity as you do.

Of course, yes. I try to keep compatibility as much as possible. I normally break things by mistake, not by design :-)

If the FPGA platform (the firmware) supports some way of remote control from a PC, to start a core and transfer files, then the rest of the regression setup can be done from the PC and the core design, I think.

jotego avatar Sep 14 '23 10:09 jotego

> If the FPGA platform (the firmware) supports some way of remote control from a PC, to start a core and transfer files, then the rest of the regression setup can be done from the PC and the core design, I think.

MiST has some basic control implemented over the USB-serial line, and there's an XMODEM-style file upload interface too (quite old school) - but I've never used it myself.

gyurco avatar Sep 14 '23 16:09 gyurco