godot-benchmarks

Regression testing script

Open myaaaaaaaaa opened this issue 2 years ago • 2 comments

On my system, rendering appears to be fully reproducible: given the same input, it produces the exact same byte sequence, as long as it runs in single-threaded mode. (In multithreaded mode, the output can change imperceptibly between runs, so it may require a fuzzy tester.)

The following script leverages this property to test if the working tree contains rendering regressions via checksums, which is immediately helpful for anyone working on Godot's rendering engine, and can be useful as a starting point for a proper visual regression testing system.

This may also be relevant to https://github.com/Calinou/godot-rendering-tests

Requires #33 in order to allow every benchmark to run for an exact number of frames.

Requires https://github.com/godotengine/godot/pull/72689 for environment variable support.

#!/usr/bin/bash -ex

##### Usage:
# cd ~/godot/
# ~/godot-benchmarks/name-of-this-script.sh [scons-args...] 
benchdir=$(dirname "$0")



rm -f bin/godot*

git stash
scons "$@"
mv bin/godot.* bin/godot_baseline

git stash pop --index
scons "$@"
mv bin/godot.* bin/godot_changed

#if the working tree is clean, maybe build HEAD^ and HEAD instead?



export GODOT_THREADING_WORKER_POOL_MAX_THREADS=1
export GODOT_DISPLAY_WINDOW_SIZE_VIEWPORT_WIDTH=640
export GODOT_DISPLAY_WINDOW_SIZE_VIEWPORT_HEIGHT=360
rm -rf /tmp/a/ /tmp/b/
mkdir -p /tmp/a/
mkdir -p /tmp/b/
bin/godot_baseline --path "$benchdir" --disable-vsync --fixed-fps 5 --write-movie /tmp/a/test.png -- --run-benchmarks --run-while='frame<5' --include-benchmarks='rendering/*'
bin/godot_changed  --path "$benchdir" --disable-vsync --fixed-fps 5 --write-movie /tmp/b/test.png -- --run-benchmarks --run-while='frame<5' --include-benchmarks='rendering/*'



cd /tmp/a/
shasum *.png >../test.sha
cd /tmp/b/
shasum -c ../test.sha
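
The open question in the script's comment (what to build when the working tree is clean) could be handled along these lines. This is only a sketch: `tree_is_dirty`, `pick_revisions`, and the `WORKTREE` marker are hypothetical names, not part of godot-benchmarks. It checks tracked files only, since `git stash` leaves untracked files alone by default.

```bash
#!/usr/bin/bash
# Hypothetical helper: succeed if tracked files differ from HEAD,
# either unstaged (git diff) or staged (git diff --cached).
tree_is_dirty() {
	! git diff --quiet || ! git diff --cached --quiet
}

# Print the two things to build, baseline first.
# "WORKTREE" is a made-up marker meaning "the uncommitted changes",
# i.e. the stash / stash pop path used by the script above.
pick_revisions() {
	if tree_is_dirty
	then
		echo "HEAD WORKTREE"
	else
		echo "HEAD^ HEAD"
	fi
}
```

A caller could then do `read -r baseline changed < <(pick_revisions)` and check out or stash accordingly before each `scons` run.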

myaaaaaaaaa, Jun 25 '23 16:06

The following script, which can be run after the one above, generates a report on all rendering discrepancies found and saves it as a directory of lossless animated WebP files:

#!/usr/bin/bash -ex

rm -rf   /tmp/c/
mkdir -p /tmp/c/

cd /tmp/a/
for png in *.png
do
	if ! compare "$png" "/tmp/b/$png" -metric ae /tmp/diff.ppm
	then
		convert -delay 50 \
			'('        "$png" /tmp/diff.ppm -append ')' \
			'(' "/tmp/b/$png" /tmp/diff.ppm -append ')' \
			-define 'webp:lossless=true' "/tmp/c/$png.webp"
	fi
done
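
One thing to be aware of: ImageMagick's `compare` exits with 0 when the images are similar, 1 when they are dissimilar, and 2 on error, so `if ! compare ...` treats an error (for example, a frame missing from /tmp/b/) the same as a real difference. A minimal sketch of telling the cases apart, assuming a hypothetical helper named `frame_status`:

```bash
#!/usr/bin/bash
# Hypothetical helper: classify one frame pair by compare's exit status.
# Arguments: baseline image, changed image, path for the diff image.
frame_status() {
	local status=0
	# compare prints the AE metric on stderr; only the exit status is used here.
	compare -metric AE "$1" "$2" "$3" 2>/dev/null || status=$?
	case $status in
		0) echo same ;;
		1) echo different ;;
		*) echo error ;;
	esac
}
```

The loop could then build a WebP only when `frame_status "$png" "/tmp/b/$png" /tmp/diff.ppm` prints `different`, and report `error` cases separately.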

See the example report below (converted to a GIF for GitHub), which was generated from the following sample regression:

 	SafeNumeric<uint64_t> num_keep;
 
-	for_range(p_from, p_to, true, SNAME("RenderCullInstances"), [&](const int i) {
+	for_range(p_from, p_to/2, true, SNAME("RenderCullInstances"), [&](const int i) {
 		KeepInstance keep_instance;

[Image: regression (animated example report)]

myaaaaaaaaa, Jun 25 '23 18:06

This looks very interesting :slightly_smiling_face: Thanks for your work on this!

I'll take a look in the future as I'm currently focused on PR reviews for the 4.1 release.

Calinou, Jun 26 '23 07:06