minGPT-tuned
Question about model tuning
Hi,
Thanks for making your enhancements to minGPT available. I am curious why your play_math model can handle ndigit = 4, when anything above ndigit = 3 with stock minGPT ends in SIGKILL.
Ultimately I am trying to train on much longer addition sequences, eventually for the purpose of multiplication. But it looks like whenever the sequence length is greater than 4, dataset construction fails with:
  File "/home/asdf/minGPT-tuned/play_math.py", line 108, in <module>
    train_dataset = AdditionDataset(ndigit=ndigit, split='train')
  File "/home/asdf/minGPT-tuned/play_math.py", line 80, in __init__
    perm = r.permutation(num)
  File "mtrand.pyx", line 4528, in numpy.random.mtrand.RandomState.permutation
MemoryError: Unable to allocate 74.5 GiB for an array with shape (10000000000,) and data type int64
From here:
perm = r.permutation(num)
Any thoughts on model tuning to support, say, 24-bit addition sequences?
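Try a Dataset/DataLoader setup that generates the AdditionDataset examples dynamically, instead of building the full dataset in __init__. The r.permutation(num) call allocates one int64 per possible example, and with the 10,000,000,000 entries in your traceback that is 10^10 × 8 bytes ≈ 74.5 GiB, so this is a memory problem in the data pipeline rather than a model tuning problem.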
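A minimal sketch of what a dynamically generated dataset could look like. This is not the code from play_math.py: the class name LazyAdditionDataset, the epoch_size and test_every parameters, and the CRC32-based train/test split are all assumptions made for illustration. The digit rendering and the -100 masking of the input positions follow the usual minGPT addition setup, which I am assuming your copy also uses.

```python
import random
import zlib


class LazyAdditionDataset:
    """Hypothetical sketch: builds each a + b = c example on demand.

    Nothing of size (10**ndigit)**2 is ever allocated, so memory use
    does not grow with ndigit.
    """

    def __init__(self, ndigit, split, epoch_size=10_000, test_every=500, seed=1337):
        assert split in ("train", "test")
        self.ndigit = ndigit
        self.split = split
        self.epoch_size = epoch_size  # nominal length; no storage behind it
        self.test_every = test_every  # roughly 1 in test_every pairs is held out
        self.seed = seed

    def __len__(self):
        return self.epoch_size

    def _is_test(self, a, b):
        # Assumed split rule: hash the pair so train and test never overlap,
        # without storing which pairs belong where.
        return zlib.crc32(f"{a},{b}".encode()) % self.test_every == 0

    def __getitem__(self, idx):
        # Seeding per index keeps every item reproducible, including
        # across DataLoader worker processes.
        rng = random.Random(self.seed * 1_000_003 + idx)
        n = self.ndigit
        want_test = self.split == "test"
        while True:
            a = rng.randrange(10 ** n)
            b = rng.randrange(10 ** n)
            if self._is_test(a, b) == want_test:
                break
        c = a + b
        # Zero-padded digits: n for a, n for b, n + 1 for the sum.
        render = f"{a:0{n}d}{b:0{n}d}{c:0{n + 1}d}"
        dix = [int(ch) for ch in render]
        x = dix[:-1]
        y = dix[1:]
        # Only train on the digits of the sum; mask the input positions.
        y[: 2 * n - 1] = [-100] * (2 * n - 1)
        return x, y


if __name__ == "__main__":
    ds = LazyAdditionDataset(ndigit=5, split="train")
    x, y = ds[0]
    print(len(ds), len(x), len(y))
```

To plug this into a PyTorch DataLoader you would presumably subclass torch.utils.data.Dataset and return torch.tensor(x, dtype=torch.long) and torch.tensor(y, dtype=torch.long) instead of plain lists. Note that with a fixed seed the same epoch_size examples repeat every epoch; since nothing is stored, epoch_size can be made very large at no memory cost. Drawing a test example takes about test_every rejected samples on average, which is fine for a small evaluation set.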