go_attack KataGo misconfiguration invalidates the main result.

Your config file for KataGo does not set the friendlyPassOk: false option, so KataGo's rules are not set to Tromp-Taylor. KataGo will perform a "friendly early pass", which is what you report in your paper.

Tromp-Taylor configuration is prescribed here: https://github.com/lightvector/KataGo/blob/master/docs/GTP_Extensions.md
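
As a rough illustration of the check I have in mind, here is a minimal sketch that compares a rules dictionary against Tromp-Taylor. The field names and values are my recollection of the JSON described in the linked GTP extensions doc (as returned by kata-get-rules), so treat them as assumptions and verify them against the doc. The helper name rule_mismatches is made up for this example.

```python
import json

# Assumed Tromp-Taylor settings, using field names as I recall them from
# KataGo's GTP extensions doc. Verify against the doc before relying on this.
TROMP_TAYLOR = {
    "ko": "POSITIONAL",
    "scoring": "AREA",
    "tax": "NONE",
    "suicide": True,
    "hasButton": False,
    "whiteHandicapBonus": "0",
    "friendlyPassOk": False,
}


def rule_mismatches(rules_json):
    """Return {field: (actual, expected)} for every field that differs.

    rules_json is a JSON string describing the active rules.
    rule_mismatches is a hypothetical helper, not part of KataGo.
    """
    rules = json.loads(rules_json)
    return {
        key: (rules.get(key), expected)
        for key, expected in TROMP_TAYLOR.items()
        if rules.get(key) != expected
    }


# Example input: everything matches Tromp-Taylor except friendlyPassOk.
reported = (
    '{"hasButton":false,"ko":"POSITIONAL","scoring":"AREA","suicide":true,'
    '"tax":"NONE","whiteHandicapBonus":"0","friendlyPassOk":true}'
)
print(rule_mismatches(reported))
```

An empty result would mean the reported rules agree with the assumed Tromp-Taylor settings on every field listed.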

To summarize, your bot and your judging code are working with Tromp-Taylor rules while KataGo is not.

This misconfiguration is the root cause why your network is able to exploit KataGo. I'm sorry that this invalidates the main result of your paper.

Notably, there was a case where a human player exploited the rules in a similar way: https://senseis.xmp.net/?DisputeMeroJasiek

Nov 08 '22 04:11 lukaszlew

Hi Lukas,

My understanding is that friendlyPassOk is only an option for GTP and analysis. We evaluate using match, not GTP, so I don't think it's even a supported flag there. Which makes sense -- friendlyPassOk is a flag designed for play with humans, not KataGo-vs-KataGo.

There is one case this could cause problems -- our baseline attacks (hard-coded exploits) did use GTP. But I think friendlyPassOk defaults to false if omitted? https://github.com/HumanCompatibleAI/KataGo-custom/blob/4891d1630699a8081840b4e50c2ce63f397260fa/cpp/game/rules.h#L38 I don't see anywhere that would set it to True by default.

> This misconfiguration is the root cause why your network is able to exploit KataGo. I'm sorry that this invalidates the main result of your paper.

I appreciate the bug report, but this seems like a bit hasty of a claim ;) Did you take the time to understand our evaluation code? Or check what the default GTP flag was?

There are a lot of small implementation details that can matter, so I always appreciate a 2nd pair of eyes, but we did do a lot of sanity checking before putting this work out there...

Nov 08 '22 05:11 AdamGleave

I read the Ars Technica article about your work with great interest! I have been playing Go for 30 years, am around 1-dan in strength, and watch tournament games frequently. I have spent many hours getting trashed by KataGo and was so excited that you had found a way to defeat the beast by analyzing its neural network, because how cool would that be!

Unfortunately... your result is invalid. Lukas has the right idea. KataGo is passing assuming that all black stones in its gigantic area are dead. Which, they are. That is, KataGo is simply assuming (like all serious human players) that Tromp-Taylor's amendment is in effect, i.e., from https://tromp.github.io/go.html: "As a practical shortcut, the following amendment allows dead stone removal: After only 2 consecutive passes, the players may end the game by agreeing on which points to empty (https://tromp.github.io/agree.html)." KataGo is assuming your program (or human judges) would be reasonable and agree that all the black stones in white's area are dead. Any human player would make the same assumption, and pass at around the same point that KataGo passes, for the same reason: we can all see the black stones (in white's 75% of the board) are dead. In literally no serious human game would white bother spending dozens more moves to go around collecting all those dead black stones in white's territory. The requirement to do so in the formal rules is just that: a formality, which never applies in practice, because human players (or computer players, or match judges) can almost* always easily determine the life and death of stones at the end of the game. You can also beat Ke Jie or JungHwan Park or anyone else if you surprise them with a change in rules like this.

This result has nothing to do with blind spots in the neural network, and everything to do with a different assumption about which rules are in effect. The link Lukas provided (https://senseis.xmp.net/?DisputeMeroJasiek) is instructive: Jasiek tried to swindle his opponent the same way your program is swindling KataGo, and got the expected result, which is that the judges ruled that having a bunch of dead stones in your opponent's territory does not make you the winner.

I'm sorry to say it (I'm honestly disappointed), but your result is invalid. Your claimed victories over KataGo are actually not victories at all. KataGo crushed your program under the rule set it was assuming, as would be expected.

Does anyone on your team play tournament go? At what level? Because I feel your team misunderstands how Go is played in practice in a way that any dan-level player would recognize immediately.

Kenneth Duda, Menlo Park, CA, 956-433-3339

* There are artificial cases involving piles of sekis and kos where determining the liveness of stones is legitimately hard, but your example games are not among them.

Nov 08 '22 15:11 kduda

While I cannot give advice on KataGo settings, as others mentioned, the result of the paper is invalid. I play at 6-7 dan on the Fox server, and all the games in your paper are won by KataGo (the victim) for sure. I assume your scoring system is the problem: it is probably incapable of detecting which stones should be considered dead even without being captured.

Nov 08 '22 15:11 ChaozR

Hi all,

Thanks for your feedback. We're well aware these games are only won under computer Go scoring rules, not human play. We should have made this clearer in the paper and will be adding an appendix shortly to clarify the rule set and evaluation setting. But the result is not "invalid" as you put it: we win under the rule set KataGo was configured to use and which it was trained on. This is the evaluation setting we'd expect KataGo to be strongest on.

I discuss the rules used in detail in https://www.reddit.com/r/MachineLearning/comments/yjryrd/comment/iuprp1z/ in response to a similar concern. In summary, although KataGo can be configured to have players select dead stones, this is only used when playing with humans. Indeed it wouldn't make much sense during training -- KataGo initially doesn't know which stones are dead or alive so there'd be a bootstrapping problem. KataGo does implement a rudimentary form of dead stone removal using Benson's algorithm even in computer Go, but in our games the adversary stones are not provably dead (under any sequence of legal moves) so do not get removed by this.
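
To make the scoring point concrete, here is a minimal sketch of Tromp-Taylor area scoring on a toy 5x5 position. It is not our evaluation code or KataGo's scorer, and the board, the function name tromp_taylor_score, and the 'X'/'O'/'.' encoding are made up for illustration. It shows how a single uncaptured stone inside the opponent's area stops the surrounding empty points from counting for anyone.

```python
from collections import deque


def tromp_taylor_score(board):
    """Return (black, white) area scores.

    board is a list of equal-length strings: 'X' black, 'O' white, '.' empty.
    A player scores one point per stone, plus every empty point in a region
    that touches only that player's stones. No dead stones are removed.
    """
    rows, cols = len(board), len(board[0])
    score = {"X": 0, "O": 0}
    seen = set()
    for r in range(rows):
        for c in range(cols):
            cell = board[r][c]
            if cell in score:
                score[cell] += 1
                continue
            if (r, c) in seen:
                continue
            # Flood-fill this empty region and record which colors border it.
            size, borders = 0, set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if not (0 <= ny < rows and 0 <= nx < cols):
                        continue
                    if board[ny][nx] == ".":
                        if (ny, nx) not in seen:
                            seen.add((ny, nx))
                            queue.append((ny, nx))
                    else:
                        borders.add(board[ny][nx])
            # Only regions bordered by exactly one color count as area.
            if len(borders) == 1:
                score[borders.pop()] += size
    return score["X"], score["O"]


# White walls off the left side, but one black stone is left uncaptured in it.
with_stray_stone = ["...OX", ".X.OX", "...OX", "...OX", "...OX"]
# The same position after White has captured that stone.
after_capture = ["...OX", "...OX", "...OX", "...OX", "...OX"]

print(tromp_taylor_score(with_stray_stone))  # (6, 5)
print(tromp_taylor_score(after_capture))     # (5, 20)
```

In the first position the 14 empty points on the left touch both colors, so they count for neither side and Black leads 6 to 5. Once the stray stone is captured, the whole region counts for White.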

I am, however, sympathetic to a variant of this critique, namely that we're exploiting an edge case, and one that's fairly easy to patch -- just change KataGo to never end the game early before passing. We report on exactly that in the paper, and it does defeat this specific adversarial policy. However, we have since found that if we just repeat the attack, it finds a new adversarial policy that doesn't rely on passing. Although this wasn't our aim, this new adversary also wins under standard Chinese/Japanese rules. We should have an updated version of the preprint up in a few weeks.

I'll close this issue in a couple of days unless there's any new information here that would suggest this is a real problem.

Nov 08 '22 17:11 AdamGleave

Thanks Adam for being so explicit with your explanations.

I'm jumping to a conclusion based on friendlyPassOk: false because this is exactly what happens in all your games. KataGo is (friendly) passing early.

Nov 08 '22 17:11 lukaszlew

> I'm jumping to a conclusion based on friendlyPassOk: false because this is exactly what happens in all your games. KataGo is (friendly) passing early.

Understandable, it does look similar, but this is also exactly what would happen if the adversary was tricking the victim into making a bad move.

We're doing some interpretability on the victim network and search to better understand why this occurs. So far I can say the victim policy prior assigns high probability (typically >20%) to passing, so it really does think it's the best (or one of the best) moves. By contrast, friendlyPassOk is extra, hard-coded logic that's not part of the network at all.

Nov 08 '22 18:11 AdamGleave

Thanks for the explanations.

> we win under the rule set KataGo was configured to use and which it was trained on. This is the evaluation setting we'd expect KataGo to be strongest on.

I agree that the strategy can interrupt AI training.

> but in our games the adversary stones are not provably dead (under any sequence of legal moves) so do not get removed by this.

But this is still not true. I looked into the games again, and I am quite sure that the stones counted as 'not dead yet' cannot survive under any sequence of legal moves. The adversarial policy could still decide to invade another vast area of the victim's territory, but the victim would have responded then. This is why the victim passes: the pass is indeed the best move in a human sense.

I would rather read the result as suggesting that computer Go scoring, or KataGo's scoring system, can be improved, and I recommend that your team find someone who can review the games in follow-up research.

I look forward to the results with Chinese/Japanese rules :)

Nov 09 '22 00:11 ChaozR

I think human evaluation cannot emulate an adversarial AI. I was able to beat KataGo with friendlyPassOk=false under Tromp-Taylor rules by playing in the following style:

Make a small area alive
Keep my own stones in KataGo's moyo weak, so that the KataGo network does not decide these stones are in a capturing race
Don't make space after capturing stones, to prevent KataGo's area from becoming pass-alive

https://gokifu.net/t2.php?s=1761667966039857

Nov 09 '22 04:11 zakki

> it will be prevented, and play some move adjacent to a dead stone instead.

This logic works only if the live stones are "strictly pass-alive", but there are few pass-alive stones in the early stage of a game. It seems that the example games don't contain strictly pass-alive stones.

https://github.com/lightvector/KataGo/blob/42892ea19b4256803ba7a0ff18d1096a84d11fe6/cpp/program/playutils.cpp#L1044 https://github.com/lightvector/KataGo/blob/42892ea19b4256803ba7a0ff18d1096a84d11fe6/cpp/search/search.cpp#L1007
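
For readers unfamiliar with the term, here is a minimal sketch of the textbook Benson criterion for pass-alive (unconditionally alive) chains. It is not KataGo's implementation from the files linked above, and the helper names and the 'X'/'O'/'.' board encoding are made up for illustration. A chain survives only if it has at least two "vital" enclosed regions, meaning regions whose empty points are all liberties of that chain.

```python
def neighbors(p, rows, cols):
    """Return the on-board orthogonal neighbors of point p."""
    r, c = p
    steps = ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1))
    return [(y, x) for y, x in steps if 0 <= y < rows and 0 <= x < cols]


def components(points, rows, cols):
    """Split a set of points into 4-connected components."""
    points, out = set(points), []
    while points:
        stack, comp = [points.pop()], set()
        while stack:
            p = stack.pop()
            comp.add(p)
            for n in neighbors(p, rows, cols):
                if n in points:
                    points.remove(n)
                    stack.append(n)
        out.append(frozenset(comp))
    return out


def pass_alive_chains(board, color):
    """Return the chains of `color` that Benson's criterion proves alive."""
    rows, cols = len(board), len(board[0])
    all_pts = {(r, c) for r in range(rows) for c in range(cols)}
    own = {p for p in all_pts if board[p[0]][p[1]] == color}
    chains = components(own, rows, cols)
    # Enclosed regions: connected areas of everything that is not `color`.
    regions = components(all_pts - own, rows, cols)

    def is_vital(region, chain):
        # Every empty point of the region must be adjacent to the chain.
        return all(
            any(n in chain for n in neighbors(p, rows, cols))
            for p in region
            if board[p[0]][p[1]] == "."
        )

    while True:
        # Keep chains that have at least two vital regions.
        alive = [ch for ch in chains
                 if sum(is_vital(rg, ch) for rg in regions) >= 2]
        alive_pts = set().union(*alive)
        # Keep regions whose bordering stones all belong to surviving chains.
        kept = [rg for rg in regions
                if all(n in alive_pts
                       for p in rg
                       for n in neighbors(p, rows, cols) if n in own)]
        if len(alive) == len(chains) and len(kept) == len(regions):
            return alive
        chains, regions = alive, kept


two_eyes = [".X.X.", "XXXXX", ".....", ".....", "....."]
one_eye = [".X...", "XX...", ".....", ".....", "....."]
print(len(pass_alive_chains(two_eyes, "X")))  # 1
print(len(pass_alive_chains(one_eye, "X")))   # 0
```

The second position shows why few stones qualify early in a game: a group with a single small eye and lots of open space around it is not pass-alive, however safe it looks to a human.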

Nov 10 '22 00:11 zakki