widelands icon indicating copy to clipboard operation
widelands copied to clipboard

Regression tests for start and win conditions, fix race regressions

Open tothxa opened this issue 1 year ago • 33 comments

Type of change New feature / Bugfix

Issue(s) closed Re https://github.com/widelands/widelands/issues/6176#issuecomment-1751738993

Found and fixed some race condition regressions in master

New behavior There are now regression tests for all starting and win conditions, and also for the completeness of these. Incidentally, --new_game_from_template is tested now, hopefully with all possible game options used at least once.

Most Lua calls are now protected by the Objects and Lua locks. The order is important, see added TODO comments and testsuite locking failures like https://github.com/widelands/widelands/actions/runs/7886026322/job/21518487518

Lock debug logging is disabled when acquiring a lock for the logger.

Possible regressions Workflow failures (, locking?)

tothxa avatar Oct 10 '23 02:10 tothxa

Hmm, didn't work the same as for me locally... Didn't catch #6176 at all...

At least I have an idea that may solve the timeouts...

tothxa avatar Oct 10 '23 09:10 tothxa

OK, so it was that simulated merge-into-master thing. Now it had only the expected failure.

tothxa avatar Oct 10 '23 17:10 tothxa

It should be now good then.

tothxa avatar Oct 10 '23 17:10 tothxa

It succeeded again, but I'm not sure how it will behave the next time under stress. I really don't even understand how lua, its coroutines, and particularly sleep() work when the game is paused by the game summary window. But the cause of many timeout failures seemed to be that the check coroutine didn't run after the summary window had been opened, while it runs in 90+% of the tries.

tothxa avatar Nov 15 '23 01:11 tothxa

OK, so I have to solve the problem of reliably running a script while the game is paused or when the game summary window is opened. I'm not sure which would be cleaner.

tothxa avatar Nov 27 '23 09:11 tothxa

Maybe a stupid question: Isn't it possible to have a switch show_game_result which defaults to true but for these tests it can be set to false? And if it is false then don't pause the game, show the result in the console and quit immediately?

Show result in console may can help to automate AI-training.

frankystone avatar Dec 05 '23 08:12 frankystone

Maybe a stupid question: Isn't it possible to have a switch show_game_result which defaults to true but for these tests it can be set to false? And if it is false then don't pause the game, show the result in the console and quit immediately?

(Verbose) Logging end result reporting and setting as well as the end of the game is definitely a good idea. It's currently only done in the game host, but it should be added to the client and singleplayer too.

As for the tests, since I've finally managed to make them work reliably, I'd stick to my current approach because I think it's better to have tests that don't need active co-operation from the tested functions.

And the realtime tasks may come in handy for other uses too. Most notably the new UI plugin scripts, because actually, contrary to my code comments, my testing showed that for scripts run from the debug console, it's the old (gametime) sleep that causes an immediate desync even on a print("hello") (I guess because resuming coroutines is written to the syncstream), while the new realtime sleep works just fine. (obviously as long as the script doesn't manipulate the game state)

Show result in console may can help to automate AI-training.

Yes, for that use case we may need an option for game summary to exit instead of pause, in addition to the above mentioned logging.

tothxa avatar Dec 05 '23 21:12 tothxa

I guess plugin timer (of pr GH #6030) could be used instead of implementing real time scheduled commands. Those two are for a similar problem. The repeated called function would need to check for the win message, set a variable to indicate the end, and close the win message (and possibly set gamespeed) to let the normal code run. Then the normal code can simply use sleep().

Why is the map not in test/maps/ ? As long as there is no test/maps/XXX.wmf/scripting/test_*.lua, no other test will be created for it. (But there could be some, which would not interfere. And coming template tests would be encouraged to use existing maps.) To have the templates (X.wgt) and the script (X.lua) together in test/templates/ looks good. I see directly which ones belong together.

Some tests take long. (slowest is test_struggling_outpost-wood_gnome.lua with 2:30) Despite setting game.desired_speed = 100000 in test/scripting/check_game_end.lua.

(Using configparser from core python I suggest in the file view.)

Really a great idea to let the test script arrange the winner. :+1: :+1:

SimonHeimberg avatar Jan 10 '24 19:01 SimonHeimberg

Additional idea:

  • one test could have custom_starting_positions="false" to check that the AI selects a position. Currently the AI just selects the default position, but the test should not care about this. test_headquarters-defeat_all.wgt looks like a good candidate.
  • one test (or a few) could check if the number and settings of players is as expected. This would check functionality of --new_game_from_template.
  • one test could set random="true" (and ai=empty, ...) for an additional player and check if it is generated.

SimonHeimberg avatar Jan 10 '24 20:01 SimonHeimberg

tothxaMirrored from Codeberg On Thu Jan 11 01:10:19 CET 2024, Tóth András (tothxa) wrote:


Re https://github.com/widelands/widelands/pull/6182#issuecomment-1885614358 by @SimonHeimberg

I guess plugin timer (of #6030) could be used instead of implementing real time scheduled commands. Those two are for a similar problem.

Oops, indeed it looks so. Sorry, I haven't followed that PR closely. I'll try to convert to that, so this PR will depend on UI Plugins.

Why is the map not in test/maps/ ? As long as there is no test/maps/XXX.wmf/scripting/test_*.lua, no other test will be created for it.

I think that would be confusing. But I made the map initially in the hope that the AI would conquer the artifact on its own, but it turned out it fails more often than not, so now the test script conquers it. So I can look for a suitable standard map instead.

Some tests take long. (slowest is test_struggling_outpost-wood_gnome.lua with 2:30) Despite setting game.desired_speed = 100000 in test/scripting/check_game_end.lua.

I don't think they take unreasonable time in the workflow. I intentionally added some more time for wood gnome though to allow tree growth, but the high speed works for me.

bunnybot avatar Jan 11 '24 00:01 bunnybot

Why is the map not in test/maps/ ? As long as there is no test/maps/XXX.wmf/scripting/test_*.lua, no other test will be created for it.

I think that would be confusing. But I made the map initially in the hope that the AI would conquer the artifact on its own, but it turned out it fails more often than not, so now the test script conquers it. So I can look for a suitable standard map instead.

I doubt there is an existing map with at least 5 players and one artifact only.

SimonHeimberg avatar Jan 11 '24 18:01 SimonHeimberg

* one test could have `custom_starting_positions="false"` to check that the AI selects a position.

Trading outpost / collectors uses custom starting positions, the rest don't.

* one test (or a few) could check if the number and settings of players is as expected. This would check functionality of  `--new_game_from_template`.

I think the number is already tested when winners and losers are checked. I'll have a look into what settings we can check.

* one test could set random="true" (and ai=empty, ...) for an additional player and check if it is generated.

OK.

tothxa avatar Jan 11 '24 23:01 tothxa

tothxaMirrored from Codeberg On Fri Jan 12 20:55:23 CET 2024, Tóth András (tothxa) wrote:


OK, I managed to use the plugin timer feature from #6030, so I removed the realtime stuff. That's actually for the better for this PR. (I'll revert merging it soon of course, when I don't need more workflow runs.)

Some tests take long. (slowest is test_struggling_outpost-wood_gnome.lua with 2:30) Despite setting game.desired_speed = 100000 in test/scripting/check_game_end.lua.

The last run with only these tests shows they take less than 2 minutes all together in Release builds, but 6 to 8 minutes in Debug builds. As these are meant to only test things inside data/, I don't think we'd lose much coverage if we skipped them in Debug builds... But let's do that when we optimise tests in general for further workflow optimisations.

bunnybot avatar Jan 12 '24 20:01 bunnybot

one test could set random="true" (and ai=empty, ...) for an additional player and check if it is generated.

Done, except for the check. How should I check it?

tothxa avatar Jan 13 '24 20:01 tothxa

one test could set random="true" (and ai=empty, ...) for an additional player and check if it is generated.

Done, except for the check. How should I check it?

I made a pr to this branch with a suggestion how to test the player with a random tribe. And some more pr with other suggestions

SimonHeimberg avatar Jan 28 '24 10:01 SimonHeimberg

@SimonHeimberg I took over from your suggestions what I agree with. Thank you for them, some of them were major improvements.

Sorry about the rest.

tothxa avatar Feb 01 '24 00:02 tothxa

I believe the latest failures are due to a regression in master. Most likely introduced by #6153, because the failures only happen in the template tests and that has been the only PR touching the template startup code since the previous successes. (or actually in a very long time?)

tothxa avatar Feb 11 '24 01:02 tothxa

I believe the latest failures are due to a regression in master. Most likely introduced by #6153

I was completely wrong. The failures started after the naval warfare merge, but I didn't notice because I expected failure due to the macos issue. That's much harder to track then.

tothxa avatar Feb 12 '24 01:02 tothxa

tothxaMirrored from Codeberg On Mon Feb 12 15:21:57 CET 2024, Tóth András (tothxa) wrote:


<@>Nordfriese Could you please have a look at the changes to base/multithreading.cc, whether they can stay?

This doesn't solve all deadlocks though, as the last few testsuite failures show, nor #6327.

bunnybot avatar Feb 12 '24 14:02 bunnybot

tothxaMirrored from Codeberg On Tue Feb 13 21:45:31 CET 2024, Tóth András (tothxa) wrote:


Why do you need this change?

To resolve this deadlock: https://github.com/widelands/widelands/actions/runs/7886026322/job/21518487518

And the kLua lock was needed to resolve this double free: https://github.com/widelands/widelands/actions/runs/7873658670/job/21481637915

There are 4 other options I see:

  1. Lock kLua before kObjetcs for Panel::think() just for the possibility that there may be a plugin timer: overkill, probably pretty much disables parallel execution
  2. Lock kObjects before all lua calls (along with, and just before kLua) — maybe slightly better, and marginally reasonable
  3. The actual solution, but probably huge work, so not for v1.2: Don't lock kObjects at all in ui_basic/panel.cc, find the actual places where this lock is needed instead
  4. Not guaranteed to work: We can test whether the regression was caused by GH6139 or some other change in naval warfare, and revert it if so or if found. But I don't have energy for any more debugging after 3 days of fighting this.

So I thought this was simple enough, and still think this would be the best option if we still want to release v1.2 in a foreseeable time. OK, option 2 above may worth a try too if you prefer that.

bunnybot avatar Feb 13 '24 20:02 bunnybot

NordfrieseMirrored from Codeberg On Tue Feb 13 22:08:44 CET 2024, Benedikt Straub (Nordfriese) wrote:


The new kLua locks look good, in fact I wonder if we should add this in master anyway as a separate commit. I once encountered a non-reproducible crash while working on my plugin that I suspect might be fixed by this.


I don't see how 6139 could have caused an issue here, all it does is to make deadlocks more visible and ensure that notes are not overlooked by the thread meant to handle them.

Option 3 would be the way to go. But since this means a risk of new hard-to-find race conditions that's definitely not for v1.2.

If you want to investigate if 2 would fix the problems while not impacting performance then we could have that as a workaround in v1.2, so long as we change it shortly after the release. If this workaround isn't easily doable, or has too much of an impact, it might be safer to postpone this PR for v1.3.

bunnybot avatar Feb 13 '24 21:02 bunnybot

tothxaMirrored from Codeberg On Tue Feb 13 22:23:23 CET 2024, Tóth András (tothxa) wrote:


If this workaround isn't easily doable, or has too much of an impact, it might be safer to postpone this PR for v1.3.

That's actually just shooting the messenger. The only C++ change in this PR before the regression introduced by the naval warfare merge was adding the Lua interface for PlayerEndStatus. (which is not involved in the memory violations / deadlocks) This is basically just about adding more tests, and, since the naval merge, fixing the regressions those tests found.

OK, I'll do 2. (no big deal, kLua is only locked in very few places), and let's see the testsuite. (Unfortunately I can't reproduce the races locally.)

bunnybot avatar Feb 13 '24 21:02 bunnybot

tothxaMirrored from Codeberg On Wed Feb 14 09:07:23 CET 2024, Tóth András (tothxa) wrote:


Option 2 passed the testsuite, and the overall time was (very slightly, but observably) better than the previous approach. I've pushed a tweak to the tests, so let's see the result of another run.

bunnybot avatar Feb 14 '24 08:02 bunnybot

tothxaMirrored from Codeberg On Wed Feb 14 10:55:41 CET 2024, Tóth András (tothxa) wrote:


The nullptr deref in sample_statistics() happened again in the second run. This was a rare alternative manifestation of the double free before. I'm not sure if the plugin timer in logic approach really made it go away or just was lucky (0 problems in 3 runs). Anyway, I've triggered a re-run, let's see how many errors we get in 3 runs after 1 in 2.

Update: The (full) re-run completed without errors, so it's 1 errors in 3 complete runs, while it had been 2 or 3 errors on average in every run before. However I've found another important entry point that was still unprotected, so I added locking for it, and then some more.

bunnybot avatar Feb 14 '24 09:02 bunnybot

tothxaMirrored from Codeberg On Wed Feb 14 20:24:35 CET 2024, Tóth András (tothxa) wrote:


Too bad #6154 got merged so late. I wanted to make the .wgt extension in this PR "official" (as in make widelands xxx.wgt equivalent of --new_game_from_template) after it. Now that's left for v1.3.

bunnybot avatar Feb 14 '24 19:02 bunnybot

tothxaMirrored from Codeberg On Thu Feb 15 14:11:44 CET 2024, Tóth András (tothxa) wrote:


<@>bunnybot update

bunnybot avatar Feb 15 '24 13:02 bunnybot

tothxaMirrored from Codeberg On Thu Feb 15 22:35:57 CET 2024, Tóth András (tothxa) wrote:


Isn't this the situation that GH6139 was supposed to solve? Also why don't we get a log of the deadlock?

I don't know if it's possible, but I think LuaPlayer::message_box (and any other lua functions that use NoteThreadSafeFunction::instantiate too) should make their coroutine yield, release locks for the duration of the UI thread call, then when the call returns, reacquire locks and resume the coroutine. Can you help implementing that?

bunnybot avatar Feb 15 '24 21:02 bunnybot

LuaPlayer::message_box (and any other lua functions that use NoteThreadSafeFunction::instantiate too)

Good news(?), message box is the only one.

tothxa avatar Feb 15 '24 21:02 tothxa

NordfrieseMirrored from Codeberg On Thu Feb 15 19:10:54 CET 2024, Benedikt Straub (Nordfriese) wrote:


Code looks OK now. However if I run the first tutorial Widelands deadlocks immediately before the first message box is shown.

GDB capture, release build
(gdb) bt
<#>0  futex_wait (private=0, expected=2, futex_word=0x5555560e11a0 <_ZL26responsiveness_list_mutex_.lto_priv.0>) at ../sysdeps/nptl/futex-internal.h:146
#1  __GI___lll_lock_wait (futex=futex@entry=0x5555560e11a0 <_ZL26responsiveness_list_mutex_.lto_priv.0>, private=0) at ./nptl/lowlevellock.c:49
#2  0x00007ffff74a96f2 in lll_mutex_lock_optimized (mutex=0x5555560e11a0 <_ZL26responsiveness_list_mutex_.lto_priv.0>) at ./nptl/pthread_mutex_lock.c:48
#3  ___pthread_mutex_lock (mutex=0x5555560e11a0 <_ZL26responsiveness_list_mutex_.lto_priv.0>) at ./nptl/pthread_mutex_lock.c:93
#4  0x0000555555d35180 in MutexLock::push_stay_responsive_function(std::function<void ()>) ()
#5  0x00005555559a2a06 in UI::Panel::ModalGuard::ModalGuard(UI::Panel&) ()
#6  0x00005555559a3adf in UI::Panel::do_run() ()
#7  0x0000555555d326f3 in std::_Function_handler<void (), NoteThreadSafeFunction::instantiate(std::function<void ()> const&, bool, bool)::{lambda()#1}>::_M_invoke(std::_Any_data const&) ()
#8  0x00005555559a286a in UI::Panel::handle_notes() ()
#9  0x0000555555d3d544 in MutexLock::MutexLock(MutexLock::ID) ()
#10 0x00005555559a3e36 in UI::Panel::do_run() ()
#11 0x0000555555909dae in Widelands::Game::run(Widelands::Game::StartGameType, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
#12 0x000055555590ade1 in Widelands::Game::run_splayer_scenario_direct(std::__cxx11::list<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
#13 0x00005555559e9d4b in FsMenu::ScenarioSelect::clicked_ok() ()
#14 0x0000555555916bc6 in Notifications::Signal<unsigned int>::operator()(unsigned int) const ()
#15 0x00005555559ac7d7 in UI::Table<void*>::handle_key(bool, SDL_Keysym) ()
#16 0x000055555599d3d3 in UI::Panel::do_key(bool, SDL_Keysym) ()
#17 0x000055555599d337 in UI::Panel::do_key(bool, SDL_Keysym) ()
#18 0x000055555599d337 in UI::Panel::do_key(bool, SDL_Keysym) ()
#19 0x000055555599d337 in UI::Panel::do_key(bool, SDL_Keysym) ()
#20 0x000055555599d337 in UI::Panel::do_key(bool, SDL_Keysym) ()
#21 0x000055555599d337 in UI::Panel::do_key(bool, SDL_Keysym) ()
#22 0x000055555599d337 in UI::Panel::do_key(bool, SDL_Keysym) ()
#23 0x000055555599d337 in UI::Panel::do_key(bool, SDL_Keysym) ()
#24 0x000055555599d337 in UI::Panel::do_key(bool, SDL_Keysym) ()
#25 0x0000555555846b02 in WLApplication::handle_input(InputCallback const*) ()
#26 0x00005555559a4120 in UI::Panel::do_run() ()
#27 0x00005555559c7bba in FsMenu::MainMenu::main_loop() ()
#28 0x000055555584cba0 in WLApplication::run() ()
#29 0x000055555578a159 in main ()
(gdb) thread 17
[Switching to thread 17 (Thread 0x7fffce1fc6c0 (LWP 41228))]
<#>0  0x00007ffff74ecaa5 in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=0x7fffce1fb660, rem=0x7fffce1fb650) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:48
48	../sysdeps/unix/sysv/linux/clock_nanosleep.c: Datei oder Verzeichnis nicht gefunden.
(gdb) bt
<#>0  0x00007ffff74ecaa5 in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=0x7fffce1fb660, rem=0x7fffce1fb650) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:48
#1  0x00007ffff74f1473 in __GI___nanosleep (req=<optimized out>, rem=<optimized out>) at ../sysdeps/unix/sysv/linux/nanosleep.c:25
#2  0x00007ffff7b6ea51 in  () at /lib/x86_64-linux-gnu/libSDL2-2.0.so.0
#3  0x0000555555d3e21a in NoteThreadSafeFunction::instantiate(std::function<void ()> const&, bool, bool) ()
#4  0x0000555555c54537 in LuaGame::LuaPlayer::message_box(lua_State*) ()
#5  0x0000555555dbd0f7 in luaD_precall ()
#6  0x0000555555dcff34 in luaV_execute ()
#7  0x0000555555dbd39c in resume ()
#8  0x0000555555dbc82c in luaD_rawrunprotected ()
#9  0x0000555555dbd5e6 in lua_resume ()
#10 0x0000555555cecef1 in LuaCoroutine::resume() ()
#11 0x0000555555c35d43 in Widelands::CmdLuaCoroutine::execute(Widelands::Game&) ()
#12 0x0000555555c3d3d3 in Widelands::CmdQueue::run_queue(Duration const&, Time&) ()
#13 0x0000555555905fb0 in Widelands::Game::think() ()
#14 0x00005555559a262b in UI::Panel::logic_thread() ()
#15 0x00007ffff76dee64 in  () at /lib/x86_64-linux-gnu/libstdc++.so.6
#16 0x00007ffff74a645c in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:444
#17 0x00007ffff7526bbc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

bunnybot avatar Feb 15 '24 22:02 bunnybot

I don't know if it's possible, but I think LuaPlayer::message_box (and any other lua functions that use NoteThreadSafeFunction::instantiate too) should make their coroutine yield, release locks for the duration of the UI thread call, then when the call returns, reacquire locks and resume the coroutine.

Oops, that doesn't seem to be possible... The lock is held at the original lua entry point, so a lua function can't do anything about it, can it? Yield is probably also impossible without returning to the original entry point...

tothxa avatar Feb 15 '24 22:02 tothxa