Configurator will work correctly with default None values
Previously, if a value in train.py defaulted to None (or you tried to override a value with None), the configurator failed to apply the override because of the type mismatch.
Now, if the default value is None or the replacing value is None, the type check is skipped.
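A minimal sketch of the relaxed logic, assuming configurator.py's `literal_eval` + assert approach (the `attempt` variable and `globals()` usage follow the upstream file, but treat this as an illustration rather than the exact diff):

```python
import sys
from ast import literal_eval

for arg in sys.argv[1:]:
    # assume a --key=value argument
    assert arg.startswith('--') and '=' in arg
    key, val = arg[2:].split('=', 1)
    if key not in globals():
        raise ValueError(f"Unknown config key: {key}")
    try:
        attempt = literal_eval(val)  # "None" -> None, "5" -> 5, "1e-3" -> 0.001, ...
    except (SyntaxError, ValueError):
        attempt = val  # fall back to the raw string
    # only enforce the type check when neither side is None
    if attempt is not None and globals()[key] is not None:
        assert type(attempt) == type(globals()[key])
    globals()[key] = attempt
```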
For example, in another PR (among other changes) I propose adding a `head_size` parameter to the `CausalSelfAttention` class; if this value is None, then `head_size = n_embd // n_head` (the default behavior). The idea is to leave this value as None by default in train.py and override it from the CLI only when needed. But in the current state of configurator.py the assertion fails because of the type mismatch: NoneType != int.
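A minimal sketch of that fallback pattern (`head_size` is the hypothetical field from the other PR; the surrounding `__init__` is illustrative, not nanoGPT's actual code):

```python
import torch.nn as nn

class CausalSelfAttention(nn.Module):
    def __init__(self, config):
        super().__init__()
        # head_size is the proposed (hypothetical) config field:
        # None by default, overridable from the CLI via configurator.py
        if config.head_size is None:
            self.head_size = config.n_embd // config.n_head  # default behavior
        else:
            self.head_size = config.head_size
```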
Another example: the `top_k` argument of the `GPT.generate` method:
```python
if top_k is not None:
    v, _ = torch.topk(logits, min(top_k, logits.size(-1)))
    logits[logits < v[:, [-1]]] = -float('Inf')
```
On mps you might want to disable top-k sampling this way (at least on PyTorch==1.13.1), but you cannot, because of the type mismatch: the default is an int and you are trying to replace it with NoneType.
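To illustrate why the old check rejects this override and the relaxed one accepts it (the default value 200 here is just an illustrative stand-in):

```python
from ast import literal_eval

top_k = 200                      # illustrative default; any int triggers the issue
attempt = literal_eval("None")   # --top_k=None parses cleanly to None
# old check: fails, since type(None) != type(200)
# assert type(attempt) == type(top_k)
# relaxed check: passes, because one side is None
assert attempt is None or top_k is None or type(attempt) == type(top_k)
top_k = attempt                  # top-k filtering is now disabled
```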
Bottom line: the type check should only run when both the overridable and the overriding values are not None. That makes the algorithm less defensive but doesn't limit functionality.
Oops 😺 Not only am I guilty of a "works on my machine" approach, I also left in an unnecessary import (from another approach). I guess I'm too used to the various linters that I turned off here. Thanks @chrisociepa for spotting this 👍