Add a test in the CI that launches and closes the standalone server, checking for error logs
Split out from #1097
@IntegratedQuantum To run the server, it'll need a world.
I think it should generate that world itself rather than obtaining a premade world from the web. I'll create that sub-issue for further discussion.
Recent events have shown that this is quite important; we should get this done as soon as possible.
I've been thinking about this a bit, and I think we want two tests:
- Open existing world created from the last release, check log for errors.
- Open new world created from the commit that triggered the CI, check log for errors.

This way we can catch both migration-type errors as well as errors generally launching the game on the new version.
The first test is a bit trickier, especially since 0.0.1 doesn't support headless.
But we can provide it with a world created on 0.0.1 until 0.1.0 releases with headless support.
Then, going forward from 0.1.0, it can pull 0.1.0 and run/build from that tag to create its own world for testing. Then whenever we release a new build, we update the CI to point to the new tag.
It would be better to create a curated world, with some stuff in it to deliberately trigger certain actions (e.g. LOD regeneration).
> This way we can catch both migration-type errors as well as errors generally launching the game on the new version.
Also a controlled shutdown would be nice, to be able to catch leaks.
> It would be better to create a curated world
Are we going to require that contributors update the curated world?
I thought that we could trigger LOD regeneration, for example, simply by loading a 0.0.1 world in a build of the latest commit. That could also be useful for catching errors caused when users try to load their existing worlds in a new build.
> Also a controlled shutdown would be nice, to be able to catch leaks.
Agreed. I think either a launchConfig option or a flag (--ci or something) can instruct the game to launch the server, run one update cycle, then exit cleanly.
> Are we going to require that contributors update the curated world?

Yes, and that's fine, since it's not a problem to be compatible with worlds older than the target. Furthermore, if we only update the world on breaking changes, it would be a good way to track when a world actually becomes no longer loadable.
> I thought that we could trigger LOD regeneration, for example, simply by loading a 0.0.1 world in a build of the latest commit.
Yes, but if there is no terrain stored, then it won't actually do anything.
> Agreed. I think either a launchConfig option or a flag
I think a launch config that is time based would make the most sense. In the future we may want longer tests that e.g. connect a dummy to the server. Furthermore a timer like this could also be useful for normal hosts to e.g. regularly take the server down to run a backup routine, or to simply have a time-limited server.
I don't think controlling the runtime with a flag from the inside is that good of an idea. It's very limited and won't scale to more complicated purposes; we'll be stuck adding new arguments every time. Additionally, time is meaningless: on some hardware a few milliseconds will be enough to enter the world and tick n times, while on other hardware it won't even manage to load the assets. For non-testing purposes that granularity is useless, as humans don't operate at that scale. People will run the server using some external restart logic anyway, because it may panic and then it has to be restarted externally.
A better approach would be to create a communication protocol for server control, possibly with exchangeable backends (stdin/stdout and socket), and add one set of dedicated flags that make the server expect socket communication. The server could then communicate that it has reached a certain execution stage, and the controller could (gracefully) terminate it by issuing an appropriate command. That could be accompanied by debug commands reporting the status of different systems. The intent is for the controller to use the same command implementations as the in-game commands issued by players.
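To make the "exchangeable backends" idea a bit more concrete, here is a minimal controller-side sketch in Python. Everything named here is an assumption for illustration only: the line-based framing, the `ready` status message and the `/stop` command are hypothetical and not something Cubyz defines today. The point is just that the waiting logic does not care whether the lines travel over pipes or over a socket.

```python
import socket


class LineTransport:
    """Exchanges newline-terminated text messages over a pair of text streams."""

    def __init__(self, reader, writer):
        self._reader = reader
        self._writer = writer

    def send(self, message):
        self._writer.write(message + "\n")
        self._writer.flush()

    def receive(self):
        """Return the next line without its line ending, or None at end of stream."""
        line = self._reader.readline()
        return line.rstrip("\r\n") if line else None


def stdio_transport(process):
    """Backend 1: pipes of a subprocess.Popen started with text=True and PIPE streams."""
    return LineTransport(process.stdout, process.stdin)


def socket_transport(sock):
    """Backend 2: a connected stream socket wrapped as a text file object."""
    stream = sock.makefile("rw", encoding="utf-8", newline="\n")
    return LineTransport(stream, stream)


def stop_when_ready(transport, ready_message="ready", stop_command="/stop"):
    """Read status lines until ready_message arrives, then send stop_command.

    Both default values are hypothetical placeholders. Returns the lines seen.
    """
    seen = []
    while True:
        line = transport.receive()
        if line is None:
            raise ConnectionError("server closed the stream before becoming ready")
        seen.append(line)
        if line == ready_message:
            transport.send(stop_command)
            return seen
```

With this shape, switching from stdin/stdout to a socket only changes which factory function builds the transport; `stop_when_ready` stays the same.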
As for detecting that we are in CI, there already exists a well-known way: we should query the environment variable CI.
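As a small sketch of that check from the controller's side (GitHub Actions sets `CI=true`; which other spellings to treat as truthy is my own assumption, and the helper name is made up):

```python
import os


def running_in_ci(environ=None):
    """Return True when the CI environment variable holds a truthy value.

    The accepted spellings ("1", "true", "yes") are an assumption; adjust as needed.
    """
    env = os.environ if environ is None else environ
    return env.get("CI", "").strip().lower() in {"1", "true", "yes"}
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.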
Yes, that is a good point. A command through stdin would be nicer than a timer.
> Yes, that is a good point. A command through stdin would be nicer than a timer.
Can you elaborate on what the command should do?
Argmaster's description is very broad. It sounds like we'll need a calling program separate from Cubyz that can do this socket communication thing and can receive updates from the server about what it's executing and kill it after it reaches a desired point.
My best guess for now:
- Create a new program that can call Cubyz and read/write a socket that the Cubyz server also uses in CI.
- The new program calls Cubyz with a flag to run the server in "CI" mode. The server would use the socket to tell the calling program where it is in execution.
- The calling program waits until a certain message has been received, then issues a command that sets `running` to false, stopping the server gracefully.

There are still a lot of unknowns for me.
- What are the specifics of this communication protocol?
- Do we need to support Windows sockets or just Unix sockets?
- What execution steps do we communicate to the controller?
- What execution step do we want the CI to kill the server when it reaches?
- Do we build the calling program in a separate repo and clone it using a script during the workflow?
Probably more questions once we have the answers to those.
Don't focus too much on the socket part; for now stdin/stdout is the way to go. The socket part could then even be implemented by a third party, and I personally think that would be preferable anyways (such programs probably already exist).
> It sounds like we'll need a calling program separate

Yes, to test the game we do need another program (which we will need anyways to check the output and look for error logs and to prepare test cases by setting up the launchConfig and all that).
> What are the specifics of this communication protocol?
Honestly std.log.info and std.log.debug are probably good enough. We should probably have more of these anyways to debug other issues.
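If plain log lines are the protocol, the "look for error logs" part of the controller could be as small as the sketch below. The line shapes it matches are assumptions: `error: ...` and `error(scope): ...` follow the prefix that Zig's default `std.log` handler uses, and `[error] ...` is a made-up alternative for a custom log function. The pattern would have to be adjusted to whatever the game actually prints.

```python
import re

# Assumed line shapes: "error: msg", "error(scope): msg" or "[error] msg".
# Adjust this to the real log format of the program under test.
ERROR_LINE = re.compile(r"^\s*(\[error\]|error(\(\w+\))?:)", re.IGNORECASE)


def find_error_lines(log_text):
    """Return every line of log_text that looks like an error-level log entry."""
    return [line for line in log_text.splitlines() if ERROR_LINE.search(line)]
```

A CI step would then fail whenever the returned list is non-empty.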
> What execution step do we want the CI to kill the server when it reaches?
For this first testing server I'd say we can kill it right after it starts the main loop. In the future we will probably need other tests that also spawn fake players and test some simulation stuff, but let's not get ahead of ourselves here.
> Do we build the calling program in a separate repo and clone it using a script during the workflow?

No, this would cause too much trouble when updating Zig versions and in general when syncing other changes. All tests belong in the main repo.
The calling program (the professional term is controller) should be a Python script. Been there, done that: it's easy to create and extend and trivial to load into a CI environment, basically the lowest difficulty threshold you can get.
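For reference, a rough sketch of what such a controller could look like for the first test discussed above (start the server, wait for the main loop, stop it, collect the output). The ready marker and the stop command are passed in as parameters because neither exists in Cubyz yet; they are placeholders. stderr is merged into stdout on the assumption that log output may arrive on either stream.

```python
import subprocess
import threading


def run_server_once(command, ready_marker, stop_command, timeout=60.0):
    """Start command, send stop_command once ready_marker shows up in the output.

    Returns (exit_code, output_lines). The process is killed if it is still
    running after timeout seconds, so a hung server cannot stall the CI job.
    """
    process = subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so all log output is captured
        text=True,
        bufsize=1,  # line buffered
    )
    watchdog = threading.Timer(timeout, process.kill)
    watchdog.start()
    lines = []
    stop_sent = False
    try:
        # Iterating ends when the server closes its output, i.e. when it exits.
        for line in process.stdout:
            lines.append(line.rstrip("\r\n"))
            if not stop_sent and ready_marker in line:
                try:
                    process.stdin.write(stop_command + "\n")
                    process.stdin.flush()
                except OSError:
                    pass  # the server already went away; the exit code will show it
                stop_sent = True
        exit_code = process.wait()
    finally:
        watchdog.cancel()
        for stream in (process.stdin, process.stdout):
            try:
                stream.close()
            except OSError:
                pass
    return exit_code, lines
```

The caller would then assert that the exit code is zero and that no error-level lines appear in the returned output. Since the stop condition is a log line rather than a timer, the same script should behave the same on fast and slow CI runners.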