KataGo
                                
How is the b18-network stronger than, let's say, b40?
I am currently using an older version of KataGo. At high playouts, larger networks such as b40 or b60 are clearly stronger and more reliable than smaller networks such as b10, b15 or b20.
I saw on katagotraining.org that the b18 network is the strongest for newer KataGo. Is this really the case? And if so, does this mean that KataGo has become much stronger relative to the computing power used, as b18 is faster than b60?
The b18 network uses a much better architecture. Each block of the b18 has many more layers, so in many ways it is similar to a 36 block network of the old architecture. However, the number of channels within the blocks is also different, due to the use of bottleneck layers. It also uses a more expensive activation function, which offsets part of the speedup that you would gain from having fewer blocks and channels, but which produces much better results in training.
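As a rough illustration of the "similar to a 36 block network" comparison, here is a minimal sketch that counts 3x3 convolution layers. The block layouts are assumptions, not taken from this thread: an old-style block is assumed to stack two 3x3 convolutions, and a new-style block is assumed to be a 1x1 bottleneck convolution going in, two inner residual blocks of two 3x3 convolutions each, and a 1x1 convolution going out. The names `BlockSpec`, `OLD_BLOCK` and `NBT_BLOCK` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockSpec:
    """Hypothetical description of one trunk block (illustrative only)."""
    convs_3x3: int  # number of 3x3 convolutions in the block
    convs_1x1: int  # number of 1x1 (bottleneck) convolutions in the block


# Assumption: an old-architecture residual block stacks two 3x3 convolutions.
OLD_BLOCK = BlockSpec(convs_3x3=2, convs_1x1=0)

# Assumption: a new-architecture block is 1x1 down, two inner residual
# blocks of two 3x3 convolutions each, then 1x1 up.
NBT_BLOCK = BlockSpec(convs_3x3=4, convs_1x1=2)


def total_3x3_layers(num_blocks: int, block: BlockSpec) -> int:
    """Total number of 3x3 convolutions in a trunk of `num_blocks` blocks."""
    return num_blocks * block.convs_3x3


def equivalent_old_blocks(num_blocks: int, block: BlockSpec) -> float:
    """Old-style blocks needed to match the same number of 3x3 layers."""
    return total_3x3_layers(num_blocks, block) / OLD_BLOCK.convs_3x3


if __name__ == "__main__":
    print(equivalent_old_blocks(18, NBT_BLOCK))  # 36.0
```

This only compares depth. It says nothing about channel counts or the activation function, both of which also affect speed and strength.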
All the parameters are chosen so that overall, on most hardware, the compute cost of the new architecture b18 should be similar to the old 40-block networks, but per amount of compute used, the new architecture seems to be much better able to fit the data, so yes, it is the new strongest net.
There may be positions where the 60 block net has a slightly better long-term evaluation, particularly in perceiving whole-board-sized semeai or dragons, due to being a deeper net despite being a worse architecture. But there are probably also cases where the new 18 block nets can outread the 60 block net in tactics, particularly once you take into account that on most hardware the b18 net is roughly twice as fast to run and so will get double the playouts.
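To make the playout argument concrete, here is a small sketch of how a speed difference turns into search size under a fixed time budget. The absolute speeds are made-up placeholders; only the roughly 2:1 ratio between b18 and b60 comes from the answer above, and `playouts_in_budget` is a hypothetical helper, not part of KataGo.

```python
def playouts_in_budget(playouts_per_second: float, seconds: float) -> int:
    """Playouts completed in `seconds` at a constant search speed."""
    return int(playouts_per_second * seconds)


# Hypothetical speeds for one machine. Only the 2:1 ratio is taken from
# the discussion; real numbers depend on hardware and settings.
B60_PLAYOUTS_PER_SECOND = 500.0
B18_PLAYOUTS_PER_SECOND = 2 * B60_PLAYOUTS_PER_SECOND

if __name__ == "__main__":
    budget_seconds = 10.0
    b60 = playouts_in_budget(B60_PLAYOUTS_PER_SECOND, budget_seconds)
    b18 = playouts_in_budget(B18_PLAYOUTS_PER_SECOND, budget_seconds)
    print(b60, b18, b18 / b60)  # 5000 10000 2.0
```

So comparing the two nets at equal playouts understates b18; comparing at equal thinking time is closer to how they would actually be used.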
@lightvector
There may be positions where the 60 block net has a slightly better long-term evaluation, particularly in perceiving whole-board-sized semeai or dragons, due to being a deeper net despite being a worse architecture,
This has a significant negative impact on analysis; see #816. It is a very simple, though very large, capturing race (with a ko). Even the currently strongest 18-block network has not solved it at all. The number of moves is not that large, and even amateurs can reach the correct conclusion for a race of this level.
@lightvector
I have confirmed that the above issue (https://github.com/lightvector/KataGo/issues/816) is now resolved correctly by the latest and currently strongest network (kata1-b18c384nbt-s8493331456-d3920571699). (However, I changed the settings to get the most reasonable result in my environment.)
Cool, thanks for the report. :)