minGPT-tuned Question about model tuning

Open pablogranolabar opened this issue 4 years ago • 1 comment

Hi,

Thanks for making your enhancements to minGPT available. I am curious why your play_math model handles ndigit = 4, when anything above ndigit = 3 with stock minGPT ends in a SIGKILL.

Ultimately I am trying to train on much longer addition sequences, with multiplication as the eventual goal. But whenever ndigit > 4, building the dataset fails with:

  File "/home/asdf/minGPT-tuned/play_math.py", line 108, in <module>
    train_dataset = AdditionDataset(ndigit=ndigit, split='train')
  File "/home/asdf/minGPT-tuned/play_math.py", line 80, in __init__
    perm = r.permutation(num)
  File "mtrand.pyx", line 4528, in numpy.random.mtrand.RandomState.permutation
MemoryError: Unable to allocate 74.5 GiB for an array with shape (10000000000,) and data type int64

From here:

perm = r.permutation(num)

Any thoughts on tuning the model to support, say, 24-bit addition sequences?

Jan 05 '21 15:01 pablogranolabar

Try a Dataset that generates the addition examples dynamically and feed that to the DataLoader (instead of building the full dataset in __init__).

Jul 29 '21 04:07 BlinkDL
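
A minimal sketch of that suggestion, not code from minGPT-tuned. The class name LazyAdditionDataset, the epoch_size and test_every parameters, and the crc32-based train/test rule are all hypothetical choices made for illustration. The digit encoding and the -100 mask are assumed to mirror what play_math.py does. Each example is derived from its index on demand, so no array of size (10**ndigit)**2 is ever allocated:

```python
import random
import zlib


class LazyAdditionDataset:
    """Map-style dataset (hypothetical) that builds each a + b = c example on demand."""

    def __init__(self, ndigit, split, epoch_size=10_000, seed=1337, test_every=10):
        assert split in ("train", "test")
        self.ndigit = ndigit
        self.split = split
        self.epoch_size = epoch_size  # nominal length; the full space has 10**(2*ndigit) pairs
        self.seed = seed
        self.test_every = test_every  # roughly 1 in test_every pairs is reserved for test

    def __len__(self):
        return self.epoch_size

    def _is_test(self, a, b):
        # Assumed split rule: a stable hash of the pair decides train vs. test,
        # so the two splits stay disjoint without storing a permutation.
        return zlib.crc32(f"{a},{b}".encode()) % self.test_every == 0

    def __getitem__(self, idx):
        if not 0 <= idx < self.epoch_size:
            raise IndexError(idx)
        rng = random.Random(self.seed * 1_000_003 + idx)  # same idx -> same example
        n = self.ndigit
        limit = 10 ** n
        want_test = self.split == "test"
        # Rejection-sample until the pair falls in the requested split.
        while True:
            a, b = rng.randrange(limit), rng.randrange(limit)
            if self._is_test(a, b) == want_test:
                break
        width_c = n + 1  # the sum can carry into one extra digit
        text = f"{a:0{n}d}{b:0{n}d}{a + b:0{width_c}d}"
        dix = [int(ch) for ch in text]
        x = dix[:-1]
        y = dix[1:]
        # Mask the positions that only echo the inputs, so the loss covers the sum digits.
        y[: 2 * n - 1] = [-100] * (2 * n - 1)
        return x, y


# 2**24 is about 1.68e7, so 24-bit operands fit in 8 decimal digits.
train_dataset = LazyAdditionDataset(ndigit=8, split="train")
x, y = train_dataset[0]
```

To feed this to a torch DataLoader, x and y would likely need to be wrapped in torch.tensor(..., dtype=torch.long) inside __getitem__. Memory use stays constant in ndigit, at the cost of sampling pairs with replacement rather than walking a fixed permutation.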
