Test listed as passed if the test execution encounters a script error
Given the following test case:
func test_updown_signal_includes_column_index():
var cut = component_template.instance()
add_child_autofree(cut)
watch_signals(cut)
cut.setSize(3, 3)
cut.set_column_index(2)
var up = cut.find_node("Up")
var down = cut.find_node("Down")
util.do_a_left_click(util.get_control_center(down))
assert_signal_emitted_with_parameters(cut, "pressed", [2, 1])
util.do_a_left_click(util.get_control_center(up))
assert_signal_emitted_with_parameters(cut, "pressed", [2, 0])
I see this output:
res://test/integration/inventory/test_four_ways_component.gd
[...]
* test_updown_signal_includes_column_index
SCRIPT ERROR: test_updown_signal_includes_column_index: Invalid call. Nonexistent function 'set_column_index' in base 'MarginContainer (dd_single_4-ways_component.gdns)'.
At: res://test/integration/inventory/test_four_ways_component.gd:64.
* test_mouse_click_on_leftright_raises_signal_with_coordinates
* test_leftright_signal_includes_row_index
SCRIPT ERROR: test_leftright_signal_includes_row_index: Invalid call. Nonexistent function 'set_row_index' in base 'MarginContainer (dd_single_4-ways_component.gdns)'.
At: res://test/integration/inventory/test_four_ways_component.gd:104.
* test_leftright_is_clamped_to_size_and_zero
SCRIPT ERROR: test_leftright_is_clamped_to_size_and_zero: Invalid call. Nonexistent function 'get_column' in base 'MarginContainer (dd_single_4-ways_component.gdns)'.
At: res://test/integration/inventory/test_four_ways_component.gd:124.
11/11 passed.
cut is a script implemented with GDNative, which does not have a method named set_column_index. This means, the assertion would fail, if it were executed.
It looks like the script error breaks out of the test function, and as there were not failed assertions, the test "passed".
If I just remove the offending method call you can see the difference:
func test_updown_signal_includes_column_index():
var cut = component_template.instance()
add_child_autofree(cut)
watch_signals(cut)
cut.setSize(3, 3)
#cut.set_column_index(2)
var up = cut.find_node("Up")
var down = cut.find_node("Down")
util.do_a_left_click(util.get_control_center(down))
assert_signal_emitted_with_parameters(cut, "pressed", [2, 1])
util.do_a_left_click(util.get_control_center(up))
assert_signal_emitted_with_parameters(cut, "pressed", [2, 0])
res://test/integration/inventory/test_four_ways_component.gd
[...]
* test_updown_signal_includes_column_index
[Failed]: Expected object [MarginContainer:1738](res://bin/dd_single_4-ways_component.gdns) to emit signal [pressed] with parameters [2, 1], got [0, 1]
at line -1
[Failed]: Expected object [MarginContainer:1738](res://bin/dd_single_4-ways_component.gdns) to emit signal [pressed] with parameters [2, 0], got [0, 0]
at line -1
* test_mouse_click_on_leftright_raises_signal_with_coordinates
* test_leftright_signal_includes_row_index
SCRIPT ERROR: test_leftright_signal_includes_row_index: Invalid call. Nonexistent function 'set_row_index' in base 'MarginContainer (dd_single_4-ways_component.gdns)'.
At: res://test/integration/inventory/test_four_ways_component.gd:104.
* test_leftright_is_clamped_to_size_and_zero
SCRIPT ERROR: test_leftright_is_clamped_to_size_and_zero: Invalid call. Nonexistent function 'get_column' in base 'MarginContainer (dd_single_4-ways_component.gdns)'.
At: res://test/integration/inventory/test_four_ways_component.gd:124.
11/13 passed.
I would expect the script error to cause the same result, i.e. count the aborted test as failed.
As it is right now, this causes false-positives unless you carefully read the log.
Unfortunately Godot does not provide a way for GUT to detect Script errors. If you run in debug mode (-d from command line) then Godot should break on the line that errors out. If you are expecting certain behavior on the object then I would write an additional test that verifies it has the expected methods so that a failure is visible.
Bummer. I intend to run this as part of a continuous integration pipeline (works just fine with the server build of godot), so any kind of interactivity is not really an option. For local test execution, sure, this works (somewhat) but not in CI.
A hack may be to just parse the output and check for SCRIPT ERROR..
Any upstream feature request to get this kind of functionality into godot which I can follow or has this not been requested yet?
I don't know of any plans to. implement it but I haven't really looked. I can see reasons for and against.
To detect this, parsing the output would be your best bet. You could run in debug mode and detect the hang the same way (probably). This would be trickier but would stop the run as soon as possible.
The closest I have found so far is this thread: https://godotengine.org/qa/40779/run-dynamical-script-with-possible-errors
It mentions Expressions (not helpful..), but other than that everything just points to exception handling which won't be in godot, period.
However, I've found a workaround for the immediate problem.
To ensure we run to the end of a function, simply return a check-value and assert that. If the function ends prematurely due to a script error, it will return NULL.
Example code:
extends "res://addons/gut/test.gd"
const PASS = 1
const component_template = preload("res://inventory/text_item.tscn")
func test_focus_entered_emits_signal():
assert_eq(focus_entered_emits_signal(), PASS, "Test ended prematurely - check log for script errors")
func focus_entered_emits_signal():
var cut = component_template.instance()
add_child_autofree(cut)
watch_signals(cut)
cut.onFocusEntered() # This does not exist (renamed :P )
assert_signal_emitted_with_parameters(cut, "selected", [0, 0])
return PASS
Result:
*** Run Summary ***
res://test/integration/inventory/test_text_item.gd
- test_focus_entered_emits_signal
[Failed]: [Null] expected to equal [1]: Test ended prematurely - check log for script errors
at line -1
Would it be a possibility to include something like this in GUT?
For backwards compatibility I imagine a global switch which can be flipped in a setup function to enable this behaviour.
Or is it possible to utilize optional typing when calling the tests and differentiate with the "new" test methods having an explicit return type?
I feel including a new line in each test is too much work for this feature to be practical. Once you've discovered a possible runtime error, this looks like a decent way to test for it though.
I've thought about this a lot over time and short of Godot giving access to some counter, I don't think there is a good solution. If you think of anything else though, please share.
Closing this out, reopen if you have. any other. ideas.
This is the least invasive solution I can think of:
We can handle this via a custom return type, which is optional. Solution draft follows, not a fully fleshed out solution yet.
Prerequisite, have something like this somewhere:
class TestResult:
var type = 0
func _init(type):
self.type = type
var PASS = TestResult.new(1)
var FAIL = TestResult.new(-1)
Tests without this additional logic, remaining unchanged:
func test_set_text_updates_label():
# [...]
assert_eq(got, "Unit Test Text")
Tests with this additional logic:
func test_set_get_text() -> TestResult:
# [...]
assert_eq(got, "Unit Test Text")
return PASS
When the test methods are populated, we can differentiate methods with return type from the methods without return type:

I have yet to see what these values actually are, but they surely match some available value somewhere.
So if we hook into _populate_tests in test_collector.gd, the additional check can be enabled if, and only if, the test function has the return type TestResult. If it does not have that special return type, nothing changes.
This feature could be enabled via an additional flag which is later evaluated in gut.gd's _test_the_scripts.
Around the area of if(_is_function_state(script_result)): (line 742) an extra check would be required, similar to the assertion I used in the examples above.
Similarly , an assert_passes(call_something_else()) or an assert_does_not_pass(call_something_else()) may be useful for subroutines, for better readability than assert_eq(PASS, call_something_else()).
This could work. It still seems like extra work for the user since adding the extra line would be tedious and if you only used when you thought there might be a problem then you should just write another test.
Another idea, which just occurred to me, is to essentially partial_double all the test scripts. In the "double", each test method would get wrapped with a return:
func test_foo():
.test_foo()
return PASS
This enables the check for all test methods and could allow for some other functionality in the future. It could make debugging a nightmare though and it seems like there could be a bunch of other issues. It is probably worth trying though.
Hm.. if I understand correctly the doubling would call/wrap the original test method, not actually modify it.
Meaning it won't be able to determine if that one was handled correctly - as it may or may not have completed sucessfully.
The detection will only work on one level - so in your example the only thing we can ensure is that .test_foo() exists and did.. "something". A script error inside .test_foo() would look exactly the same as before, just with an additional level of indirection.
If the "default" behaviour of returning PASS is to be added automatically, we would need to actually modify the original test method. Change the signature and inject that return PASS at the end.
Not sure if that is feasible with GDScript?
Dang-it. Should have thought it out juuuuust a bit more before typing it. Injection isn't possible. We could rewrite the code dynamically instead of wrapping calls but that sounds really gross to do and would mess up line numbers in output.
When #184 (asserting in after_all) is fixed you could check that all tests in a script finished. You could even check if each test passed. This wouldn't occur until the end of the script but you would still end up with a failing assert (given #184 is fixed).
Note: This approach uses some undocumented functionality that should probably get some wrappers before I could endorse doing it, but here it is none-the-less.
extends 'res://addons/gut/test.gd'
# holds the names of all tests that made it to the end.
var _finished_tests = []
func _simple_check():
assert_eq(_finished_tests.size(), gut.get_current_script_object().tests.size())
func _check_each():
var tests_ran = gut.get_current_script_object().tests
var not_finished = []
for i in range(test_ran.size()):
var idx = _finished_tests.find(tests_ran[i].name)
if(idx == -1):
not_finished.append(tests_ran[i].name)
assert_eq(not_finished.size(), 0, str("tests didn't finish ", not_finished))
func after_all():
# call _simple_check or _check_each here.
func test_foo():
var a = 'a'
if(a == 1): # generates script error
assert_true(true)
# This line is required at the end of each test.
_finished_tests.append(gut.get_current_test_object().name)
Would be nice to make it works with yield too.
Risky tests are a thing in GUT 9.x
I just ran some tests locally for a game without the -d option and noticed that tests that had script errors that prevented any asserts from running were marked "risky". Then I remembered this was something I've added to the Godot 4 version of GUT and I haven't documented it yet. The risky count is not included in the xml output (and I have to figure out if the spec allows for it).
pending tests are included in the risky count, so if you only look at the risky count you will fail builds that have pending tests. You can work around this by comparing the risky count to the pending count.
Right now, the only way I can think to leverage this in a pipeline would be to get this info in a post-run hook and set the exit code if you see any risky tests.
# Not tested
extends GutHookScript
func run():
var totals = gut.get_summary().get_totals()
# include pending tests as risky
if(totals.risky > 0):
print("Found ", totals.risky, " risky/pending tests")
set_exit_code(1)
# exclude pending tests
if(totals.risky != totals.pending):
print("Found ", totals.risky - totals.pending, " risky tests")
set_exit_code(1)
Caveats
Tests that assert something before a script error occurs will not be caught by this.
Tests that cause an error when asserting should still be marked risky, but I haven't verified this. I can't think of any scenarios where this wouldn't be the case, but I don't remember how a lot of GUT works sometimes.
Some of the complex asserts like assert_property that cause an error may not be marked risky. These asserts will perform multiple asserts internally so if any of them pass before/after then the test won't be marked risky.