
Can not read weights.pb files

Open Muffty opened this issue 5 years ago • 11 comments

Hi, I'm trying to use a downloaded weights file (obtained via download_latest_network.py). It is not .txt.gz but .pb.gz, and read_weights_file does not seem to handle that format.

Muffty avatar Mar 01 '19 14:03 Muffty

@so-much-meta can you advise here? Very eager to continue progress on this project, but this seems to be a hard blocker. Happy to help update any code if you can point us in the right direction. Thanks!

advait avatar Apr 24 '19 01:04 advait

Sorry, it’s been a while since I’ve looked at this project. I’ll take a look this evening and respond, and if I can get some time soon I'll try to make some updates (e.g., I know a big need is support for SE-ResNets).


so-much-meta avatar Apr 24 '19 21:04 so-much-meta

@so-much-meta could you provide some help with how to work with the .pb files currently available on the Lc0 page?

I tried renaming the file to .txt, but then it throws a size mismatch error:

```
line 153, in from_weights_file
    param.data.copy_(w.view_as(param))
RuntimeError: shape '[8, 112, 3, 3]' is invalid for input of size 8640
```

Sumegh-git avatar Jun 10 '20 16:06 Sumegh-git

You can convert the pb.gz to a txt.gz file using the save_txt function of the Net class, but that won't fix the issue: the nets on the website use a newer format and aren't compatible with this library. You could try to make them compatible, but I found it easier to do the following:

Download the model and config yaml files from here:

https://www.comp.nus.edu.sg/~sergio-v/new/128x10-t60-2/

Then load the model using the tfprocess module:

```python
import yaml
import tfprocess


class KerasNet:

    def __init__(self, model_file="128x10-t60-2-5300.pb.gz", cfg_file="configs/128x10-t60-2.yaml"):
        with open(cfg_file, "rb") as f:
            cfg = f.read()

        cfg = yaml.safe_load(cfg)
        print(yaml.dump(cfg, default_flow_style=False))

        tfp = tfprocess.TFProcess(cfg)
        tfp.init_net_v2()
        tfp.replace_weights_v2(model_file)

        self.model = tfp.model

    def evaluate(self, leela_board):
        input_planes = leela_board.lcz_features()
        model_input = input_planes.reshape(1, 112, 64)
        policy, value, _ = self.model.predict(model_input)
        return policy, value
```

ZararB avatar Jan 16 '21 07:01 ZararB

@ZararB Thanks, this was very helpful!

Do you happen to have a solution for reading current training data? Using TarTrainingFile yields the following error:

```
Exception: Only version 3 of training data supported
```

lzanini avatar Jan 25 '21 21:01 lzanini

@lzanini Sorry, I am only using the network for evaluation and haven't looked into training

ZararB avatar Jan 26 '21 00:01 ZararB

@ZararB thank you for your answer.

Did you manage to get correct evaluations from a lczero-training network with inputs computed by this library?

I got the network to run, but the output values are surprising to say the least, and very different from the ones I get with the engine limited to nodes=1, which makes me wonder whether the inputs are still valid for the latest networks.

lzanini avatar Jan 26 '21 14:01 lzanini

@lzanini yeah the code I wrote before was a bit misleading. The policy network output needs to be run through the softmax function to get probabilities. Here is the full thing:

```python
from collections import OrderedDict

import numpy as np
import yaml
import tfprocess


class KerasNet:

    def __init__(self, model_file="128x10-t60-2-5300.pb.gz", cfg_file="configs/example.yaml"):
        with open(cfg_file, "rb") as f:
            cfg = f.read()

        cfg = yaml.safe_load(cfg)
        print(yaml.dump(cfg, default_flow_style=False))

        tfp = tfprocess.TFProcess(cfg, gpu=True)
        tfp.init_net_v2()
        tfp.replace_weights_v2(model_file)
        self.model = tfp.model

    def _softmax(self, x, softmax_temp=1.0):
        # Subtracting the max keeps exp() from overflowing; the result is unchanged.
        e_x = np.exp((x - np.max(x)) / softmax_temp)
        return e_x / e_x.sum(axis=0)

    def _evaluate(self, leela_board):
        input_planes = leela_board.lcz_features()
        model_input = input_planes.reshape(1, 112, 64)
        model_output = self.model.predict(model_input)

        policy_logits = model_output[0][0]

        legal_uci = [m.uci() for m in leela_board.generate_legal_moves()]

        if legal_uci:
            legal_indexes = leela_board.lcz_uci_to_idx(legal_uci)
            softmaxed = self._softmax(policy_logits[legal_indexes])
            softmaxed_aspython = map(float, softmaxed)
            policy_legal = OrderedDict(sorted(zip(legal_uci, softmaxed_aspython),
                                              key=lambda mp: (mp[1], mp[0]),
                                              reverse=True))
        else:
            policy_legal = OrderedDict()

        return policy_legal
```

I copied most of this from the lcztools code and modified it slightly to work with my setup.

ZararB avatar Jan 26 '21 23:01 ZararB

@ZararB Where does the term x - np.max(x) come from in your softmax definition?

The usual definition (and the one defined in tensorflow) is simply

softmax = tf.exp(logits) / tf.reduce_sum(tf.exp(logits), axis)

Since the training script uses the standard tf.nn.softmax_cross_entropy_with_logits directly on the model output (here and here), I don't see why the values need to be normalized before going through the softmax

lzanini avatar Jan 27 '21 00:01 lzanini

@lzanini Both definitions are equivalent. Subtracting the max from all the logits just lets us work with smaller numbers, which prevents overflow/NaN issues. You can use either one.
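
For example, here's a quick numerical check (just numpy, nothing Lc0-specific) showing that the two definitions agree, and that only the shifted form survives large logits:

```python
import numpy as np

def softmax_naive(x):
    # Direct definition: exp(x_i) / sum_j exp(x_j)
    e = np.exp(x)
    return e / e.sum()

def softmax_stable(x):
    # Shifted definition: exp(x - max) <= 1, so exp() cannot overflow.
    e = np.exp(x - np.max(x))
    return e / e.sum()

x = np.array([1.0, 2.0, 3.0])
assert np.allclose(softmax_naive(x), softmax_stable(x))  # identical results

big = np.array([1000.0, 1001.0])
print(softmax_naive(big))   # [nan nan] -- exp(1000) overflows to inf, inf/inf = nan
print(softmax_stable(big))  # well-defined probabilities summing to 1
```

The shift works because multiplying numerator and denominator by the same constant exp(-max(x)) leaves the ratio unchanged.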

ZararB avatar Jan 27 '21 01:01 ZararB

You're right! Thanks again 👍

lzanini avatar Jan 27 '21 01:01 lzanini