Configurator will work correctly with default None values
Previously, if a value in train.py defaulted to None (or you tried to override a value with None), the configurator failed to apply the override because of the type mismatch.
Now, if the default value is None or the replacing value is None, the type check is skipped.
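A minimal sketch of the relaxed logic, assuming configurator.py's `literal_eval` + assert approach (the `attempt` variable and `globals()` usage follow the upstream file, but treat this as an illustration rather than the exact diff):

```python
import sys
from ast import literal_eval

for arg in sys.argv[1:]:
    # assume a --key=value argument
    assert arg.startswith('--') and '=' in arg
    key, val = arg[2:].split('=', 1)
    if key not in globals():
        raise ValueError(f"Unknown config key: {key}")
    try:
        attempt = literal_eval(val)  # "None" -> None, "5" -> 5, "1e-3" -> 0.001, ...
    except (SyntaxError, ValueError):
        attempt = val  # fall back to the raw string
    # only enforce the type check when neither side is None
    if attempt is not None and globals()[key] is not None:
        assert type(attempt) == type(globals()[key])
    globals()[key] = attempt
```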
For example, in another PR (among other changes) I propose adding a `head_size` parameter to the `CausalSelfAttention` class; if this value is None, then `head_size = n_embd // n_head` (the default behavior). The idea is to leave this value as None by default in train.py and override it from the CLI only when needed. But in the current state of configurator.py the assertion fails because of the type mismatch: NoneType != int.
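A minimal sketch of that fallback pattern (`head_size` is the hypothetical field from the other PR; the surrounding `__init__` is illustrative, not nanoGPT's actual code):

```python
import torch.nn as nn

class CausalSelfAttention(nn.Module):
    def __init__(self, config):
        super().__init__()
        # head_size is the proposed (hypothetical) config field:
        # None by default, overridable from the CLI via configurator.py
        if config.head_size is None:
            self.head_size = config.n_embd // config.n_head  # default behavior
        else:
            self.head_size = config.head_size
```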
Another example: the `top_k` argument of the `GPT.generate` method:
```python
if top_k is not None:
    v, _ = torch.topk(logits, min(top_k, logits.size(-1)))
    logits[logits < v[:, [-1]]] = -float('Inf')
```
On mps you might want to disable top-k sampling this way (at least on PyTorch==1.13.1), but you cannot, because of the type mismatch: the default is an int and you are trying to replace it with NoneType.
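To illustrate why the old check rejects this override and the relaxed one accepts it (the default value 200 here is just an illustrative stand-in):

```python
from ast import literal_eval

top_k = 200                      # illustrative default; any int triggers the issue
attempt = literal_eval("None")   # --top_k=None parses cleanly to None
# old check: fails, since type(None) != type(200)
# assert type(attempt) == type(top_k)
# relaxed check: passes, because one side is None
assert attempt is None or top_k is None or type(attempt) == type(top_k)
top_k = attempt                  # top-k filtering is now disabled
```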
Bottom line: the type check should only run when both the overridable and the overriding values are not None. That makes the algorithm less defensive but doesn't limit functionality.
Oops 😺 Not only am I guilty of a "works on my machine" approach, I also left in an unnecessary import (from another approach). I guess I'm too used to the various linters that I turned off here. Thanks @chrisociepa for spotting this 👍