Gut icon indicating copy to clipboard operation
Gut copied to clipboard

Test listed as passed if the test execution encounters a script error

Open founderio opened this issue 5 years ago • 14 comments

Given the following test case:

func test_updown_signal_includes_column_index():
	var cut = component_template.instance()
	add_child_autofree(cut)
	watch_signals(cut)

	cut.setSize(3, 3)
	cut.set_column_index(2)

	var up = cut.find_node("Up")
	var down = cut.find_node("Down")

	util.do_a_left_click(util.get_control_center(down))
	assert_signal_emitted_with_parameters(cut, "pressed", [2, 1])

	util.do_a_left_click(util.get_control_center(up))
	assert_signal_emitted_with_parameters(cut, "pressed", [2, 0])

I see this output:

res://test/integration/inventory/test_four_ways_component.gd
[...]
* test_updown_signal_includes_column_index
SCRIPT ERROR: test_updown_signal_includes_column_index: Invalid call. Nonexistent function 'set_column_index' in base 'MarginContainer (dd_single_4-ways_component.gdns)'.
   At: res://test/integration/inventory/test_four_ways_component.gd:64.
* test_mouse_click_on_leftright_raises_signal_with_coordinates
* test_leftright_signal_includes_row_index
SCRIPT ERROR: test_leftright_signal_includes_row_index: Invalid call. Nonexistent function 'set_row_index' in base 'MarginContainer (dd_single_4-ways_component.gdns)'.
   At: res://test/integration/inventory/test_four_ways_component.gd:104.
* test_leftright_is_clamped_to_size_and_zero
SCRIPT ERROR: test_leftright_is_clamped_to_size_and_zero: Invalid call. Nonexistent function 'get_column' in base 'MarginContainer (dd_single_4-ways_component.gdns)'.
   At: res://test/integration/inventory/test_four_ways_component.gd:124.
11/11 passed.

cut is a script implemented with GDNative, which does not have a method named set_column_index. This means, the assertion would fail, if it were executed.
It looks like the script error breaks out of the test function, and as there were not failed assertions, the test "passed".

If I just remove the offending method call you can see the difference:

func test_updown_signal_includes_column_index():
	var cut = component_template.instance()
	add_child_autofree(cut)
	watch_signals(cut)

	cut.setSize(3, 3)
	#cut.set_column_index(2)

	var up = cut.find_node("Up")
	var down = cut.find_node("Down")

	util.do_a_left_click(util.get_control_center(down))
	assert_signal_emitted_with_parameters(cut, "pressed", [2, 1])

	util.do_a_left_click(util.get_control_center(up))
	assert_signal_emitted_with_parameters(cut, "pressed", [2, 0])
res://test/integration/inventory/test_four_ways_component.gd
[...]
* test_updown_signal_includes_column_index
    [Failed]:  Expected object [MarginContainer:1738](res://bin/dd_single_4-ways_component.gdns) to emit signal [pressed] with parameters [2, 1], got [0, 1]
      at line -1
    [Failed]:  Expected object [MarginContainer:1738](res://bin/dd_single_4-ways_component.gdns) to emit signal [pressed] with parameters [2, 0], got [0, 0]
      at line -1
* test_mouse_click_on_leftright_raises_signal_with_coordinates
* test_leftright_signal_includes_row_index
SCRIPT ERROR: test_leftright_signal_includes_row_index: Invalid call. Nonexistent function 'set_row_index' in base 'MarginContainer (dd_single_4-ways_component.gdns)'.
   At: res://test/integration/inventory/test_four_ways_component.gd:104.
* test_leftright_is_clamped_to_size_and_zero
SCRIPT ERROR: test_leftright_is_clamped_to_size_and_zero: Invalid call. Nonexistent function 'get_column' in base 'MarginContainer (dd_single_4-ways_component.gdns)'.
   At: res://test/integration/inventory/test_four_ways_component.gd:124.
11/13 passed.

I would expect the script error to cause the same result, i.e. count the aborted test as failed.
As it is right now, this causes false-positives unless you carefully read the log.

founderio avatar Aug 30 '20 16:08 founderio

Unfortunately Godot does not provide a way for GUT to detect Script errors. If you run in debug mode (-d from command line) then Godot should break on the line that errors out. If you are expecting certain behavior on the object then I would write an additional test that verifies it has the expected methods so that a failure is visible.

bitwes avatar Aug 31 '20 21:08 bitwes

Bummer. I intend to run this as part of a continuous integration pipeline (works just fine with the server build of godot), so any kind of interactivity is not really an option. For local test execution, sure, this works (somewhat) but not in CI. A hack may be to just parse the output and check for SCRIPT ERROR..

Any upstream feature request to get this kind of functionality into godot which I can follow or has this not been requested yet?

founderio avatar Aug 31 '20 23:08 founderio

I don't know of any plans to. implement it but I haven't really looked. I can see reasons for and against.

To detect this, parsing the output would be your best bet. You could run in debug mode and detect the hang the same way (probably). This would be trickier but would stop the run as soon as possible.

bitwes avatar Aug 31 '20 23:08 bitwes

The closest I have found so far is this thread: https://godotengine.org/qa/40779/run-dynamical-script-with-possible-errors

It mentions Expressions (not helpful..), but other than that everything just points to exception handling which won't be in godot, period.

However, I've found a workaround for the immediate problem.
To ensure we run to the end of a function, simply return a check-value and assert that. If the function ends prematurely due to a script error, it will return NULL.

Example code:

extends "res://addons/gut/test.gd"

const PASS = 1

const component_template = preload("res://inventory/text_item.tscn")

func test_focus_entered_emits_signal():
	assert_eq(focus_entered_emits_signal(), PASS, "Test ended prematurely - check log for script errors")

func focus_entered_emits_signal():
	var cut = component_template.instance()
	add_child_autofree(cut)

	watch_signals(cut)

	cut.onFocusEntered() # This does not exist (renamed :P )

	assert_signal_emitted_with_parameters(cut, "selected", [0, 0])

	return PASS

Result:

*** Run Summary ***
res://test/integration/inventory/test_text_item.gd
- test_focus_entered_emits_signal
    [Failed]:  [Null] expected to equal [1]:  Test ended prematurely - check log for script errors
          at line -1

Would it be a possibility to include something like this in GUT?
For backwards compatibility I imagine a global switch which can be flipped in a setup function to enable this behaviour.
Or is it possible to utilize optional typing when calling the tests and differentiate with the "new" test methods having an explicit return type?

founderio avatar Sep 06 '20 13:09 founderio

I feel including a new line in each test is too much work for this feature to be practical. Once you've discovered a possible runtime error, this looks like a decent way to test for it though.

I've thought about this a lot over time and short of Godot giving access to some counter, I don't think there is a good solution. If you think of anything else though, please share.

bitwes avatar Sep 08 '20 20:09 bitwes

Closing this out, reopen if you have. any other. ideas.

bitwes avatar Sep 16 '20 02:09 bitwes

This is the least invasive solution I can think of:
We can handle this via a custom return type, which is optional. Solution draft follows, not a fully fleshed out solution yet.

Prerequisite, have something like this somewhere:

class TestResult:
	var type = 0

	func _init(type):
		self.type = type

var PASS = TestResult.new(1)
var FAIL = TestResult.new(-1)

Tests without this additional logic, remaining unchanged:

func test_set_text_updates_label():
	# [...]
	assert_eq(got, "Unit Test Text")

Tests with this additional logic:

func test_set_get_text() -> TestResult:
	# [...]
	assert_eq(got, "Unit Test Text")

	return PASS

When the test methods are populated, we can differentiate methods with return type from the methods without return type: image image

I have yet to see what these values actually are, but they surely match some available value somewhere.

So if we hook into _populate_tests in test_collector.gd, the additional check can be enabled if, and only if, the test function has the return type TestResult. If it does not have that special return type, nothing changes.
This feature could be enabled via an additional flag which is later evaluated in gut.gd's _test_the_scripts.
Around the area of if(_is_function_state(script_result)): (line 742) an extra check would be required, similar to the assertion I used in the examples above.

Similarly , an assert_passes(call_something_else()) or an assert_does_not_pass(call_something_else()) may be useful for subroutines, for better readability than assert_eq(PASS, call_something_else()).

founderio avatar Sep 25 '20 21:09 founderio

This could work. It still seems like extra work for the user since adding the extra line would be tedious and if you only used when you thought there might be a problem then you should just write another test.

Another idea, which just occurred to me, is to essentially partial_double all the test scripts. In the "double", each test method would get wrapped with a return:

func test_foo():
  .test_foo()
  return PASS

This enables the check for all test methods and could allow for some other functionality in the future. It could make debugging a nightmare though and it seems like there could be a bunch of other issues. It is probably worth trying though.

bitwes avatar Sep 28 '20 18:09 bitwes

Hm.. if I understand correctly the doubling would call/wrap the original test method, not actually modify it.
Meaning it won't be able to determine if that one was handled correctly - as it may or may not have completed sucessfully.
The detection will only work on one level - so in your example the only thing we can ensure is that .test_foo() exists and did.. "something". A script error inside .test_foo() would look exactly the same as before, just with an additional level of indirection.

If the "default" behaviour of returning PASS is to be added automatically, we would need to actually modify the original test method. Change the signature and inject that return PASS at the end.
Not sure if that is feasible with GDScript?

founderio avatar Sep 29 '20 22:09 founderio

Dang-it. Should have thought it out juuuuust a bit more before typing it. Injection isn't possible. We could rewrite the code dynamically instead of wrapping calls but that sounds really gross to do and would mess up line numbers in output.

bitwes avatar Sep 30 '20 01:09 bitwes

When #184 (asserting in after_all) is fixed you could check that all tests in a script finished. You could even check if each test passed. This wouldn't occur until the end of the script but you would still end up with a failing assert (given #184 is fixed).

Note: This approach uses some undocumented functionality that should probably get some wrappers before I could endorse doing it, but here it is none-the-less.

extends 'res://addons/gut/test.gd'

# holds the names of all tests that made it to the end.
var _finished_tests = []

func _simple_check():
	assert_eq(_finished_tests.size(), gut.get_current_script_object().tests.size())

func _check_each():
	var tests_ran = gut.get_current_script_object().tests
	var not_finished = []

	for i  in range(test_ran.size()):
		var idx = _finished_tests.find(tests_ran[i].name)
		if(idx == -1):
			not_finished.append(tests_ran[i].name)

	assert_eq(not_finished.size(), 0, str("tests didn't finish ", not_finished))

func after_all():
	# call _simple_check or _check_each here.			

func test_foo():
	var a  = 'a'
	if(a == 1): # generates script error
		assert_true(true)
	# This line is required at the end of each test.
	_finished_tests.append(gut.get_current_test_object().name)

bitwes avatar Sep 30 '20 16:09 bitwes

Would be nice to make it works with yield too.

o01eg avatar May 05 '21 05:05 o01eg

Risky tests are a thing in GUT 9.x

I just ran some tests locally for a game without the -d option and noticed that tests that had script errors that prevented any asserts from running were marked "risky". Then I remembered this was something I've added to the Godot 4 version of GUT and I haven't documented it yet. The risky count is not included in the xml output (and I have to figure out if the spec allows for it).

pending tests are included in the risky count, so if you only look at the risky count you will fail builds that have pending tests. You can work around this by comparing the risky count to the pending count.

Right now, the only way I can think to leverage this in a pipeline would be to get this info in a post-run hook and set the exit code if you see any risky tests.

# Not tested
extends GutHookScript

func run():
    var totals = gut.get_summary().get_totals()

    # include pending tests as risky
    if(totals.risky > 0):
        print("Found ", totals.risky, " risky/pending tests")
        set_exit_code(1)

    # exclude pending tests 
    if(totals.risky != totals.pending):
        print("Found ", totals.risky - totals.pending, " risky tests")
        set_exit_code(1)

Caveats

Tests that assert something before a script error occurs will not be caught by this.

Tests that cause an error when asserting should still be marked risky, but I haven't verified this. I can't think of any scenarios where this wouldn't be the case, but I don't remember how a lot of GUT works sometimes.

Some of the complex asserts like assert_property that cause an error may not be marked risky. These asserts will perform multiple asserts internally so if any of them pass before/after then the test won't be marked risky.

bitwes avatar Dec 19 '23 19:12 bitwes