
Training on the "Shakespeare" dataset is faster on a MacBook Air (M2)

Open xiningnlp opened this issue 2 years ago • 7 comments

When I trained the model on a MacBook Air (M2) on the Shakespeare dataset (with Mac's 'mps' device and 8 GB memory), it took only around 120ms per iteration, using the same configuration the author reported for his MacBook Air (M1).

xiningnlp avatar Jan 27 '23 15:01 xiningnlp

MacBook Air M2 16GB gives ~300ms per iteration with mps on Shakespeare:

iter 23: loss 10.7863, time 305.73ms
iter 24: loss 10.8073, time 298.21ms
iter 25: loss 10.7988, time 308.87ms
iter 26: loss 10.7988, time 307.22ms
iter 27: loss 10.8011, time 310.79ms
iter 28: loss 10.8116, time 306.26ms
iter 29: loss 10.8024, time 303.40ms
iter 30: loss 10.7896, time 306.29ms
iter 31: loss 10.7739, time 304.27ms
iter 32: loss 10.7751, time 301.52ms
iter 33: loss 10.7738, time 299.67ms
iter 34: loss 10.7919, time 304.29ms
iter 35: loss 10.7413, time 302.1

tombenj avatar Jan 29 '23 16:01 tombenj
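For anyone comparing these numbers, nanoGPT's per-iteration printout boils down to a wall-clock measurement around each training step. A minimal sketch of that pattern (the `step_fn` stand-in is hypothetical; the real train.py times its own forward/backward pass and logs the actual loss):

```python
import time

def timed_iters(step_fn, n: int) -> None:
    # Mimics nanoGPT's per-iteration log line, e.g.
    # "iter 23: loss 10.7863, time 305.73ms"
    for i in range(n):
        t0 = time.time()
        loss = step_fn()  # one training step, returns the loss
        dt_ms = (time.time() - t0) * 1000
        print(f"iter {i}: loss {loss:.4f}, time {dt_ms:.2f}ms")
```

Comparing the printed time across devices (cpu vs. mps) is the quickest way to see whether the Metal backend is actually being used.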

> Macbook Air M2 16GB gives ~300ms with mps using Shakespeare: […]

Any asitop readings? Here is mine: Screen Shot 2023-01-30 at 14 33 43 (after training). When I run sample.py it says "No meta.pkl found, assuming GPT-2 encodings..." Any idea?

yangboz avatar Jan 30 '23 06:01 yangboz

That is fine. It should produce samples after that line...

tombenj avatar Jan 30 '23 07:01 tombenj
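For context, the message is just sample.py's tokenizer fallback: if the dataset directory has a meta.pkl (written by the char-level prepare scripts), it uses that character vocabulary; otherwise it falls back to the GPT-2 BPE tokenizer. A hedged sketch of that logic (`load_codec` and its layout are hypothetical; the real sample.py inlines this):

```python
import os
import pickle

def load_codec(data_dir: str):
    """Return (encode, decode), preferring a dataset's meta.pkl vocab."""
    meta_path = os.path.join(data_dir, "meta.pkl")
    if os.path.exists(meta_path):
        # Char-level vocab saved by the prepare script.
        with open(meta_path, "rb") as f:
            meta = pickle.load(f)
        stoi, itos = meta["stoi"], meta["itos"]
        encode = lambda s: [stoi[c] for c in s]
        decode = lambda ids: "".join(itos[i] for i in ids)
    else:
        # "No meta.pkl found, assuming GPT-2 encodings..."
        import tiktoken
        enc = tiktoken.get_encoding("gpt2")
        encode, decode = enc.encode, enc.decode
    return encode, decode
```

So the warning is harmless for GPT-2-style runs, but if you trained a char-level model, make sure sample.py points at the dataset directory containing meta.pkl.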

> That is fine. It should produce samples after that line...

Yes, it falls back to GPT-2. Is that equivalent to ChatGPT's model?

yangboz avatar Jan 30 '23 09:01 yangboz

> Yes, it falls back to GPT-2. Is that equivalent to ChatGPT's?

Nope. ChatGPT is based on GPT-3.5 (better than GPT-3's Davinci), plus instruction tuning with RLHF and many other things.

tombenj avatar Jan 30 '23 10:01 tombenj

Newbie question: I have an M2 MacBook Air; how do I use mps for faster training? Does it use mps by default, or do I need to set something manually? Thanks.

nexthybrid avatar Feb 19 '23 01:02 nexthybrid

@nexthybrid You can pass --device=mps on the command line or set device = 'mps' in the finetune_shakespeare.py config.

venzen avatar Feb 19 '23 05:02 venzen
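If you want the script to pick the device for you rather than hard-coding it, a small fallback helper works. This is a sketch under the assumption of PyTorch >= 1.12 (the release that introduced the mps backend); `pick_device` is a hypothetical name, not part of nanoGPT:

```python
def pick_device() -> str:
    """Prefer CUDA, then Apple's Metal backend (mps), else CPU."""
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch installed at all
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"  # Apple Silicon GPU via Metal
    return "cpu"

if __name__ == "__main__":
    print(pick_device())
```

With the explicit flag instead, the invocation would look something like `python train.py config/finetune_shakespeare.py --device=mps` (exact config path per your checkout).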