nanoGPT
cpu support
First of all, thank you Andrej Karpathy for the amazing YouTube series you have done on deep learning, and for being such a great educator on the subject online.
Regarding this PR: I noticed that there was support for the poor man's configurator as well as the poor man's data loader, but I failed to find support for the poor man's hardware, so I added it. It might also be useful when high performance, or performance in general, isn't a requirement, or when one would like to use the computer as some sort of electric heating system.
# in case there is no GPU but a lot of time to spend
python train.py finetune_shakespeare --device=cpu --init_from=gpt2
haha good idea actually thanks. Should we detect device type based on device though? Leaning to yes
Leaning towards a yes on that one as well... Fixed it. I will update the comment above so it now shows how it is done!
Hi @Ricardicus ,
Thank you for updating the code, but when I tested it in my local environment with --device=cpu I encountered an error stating that AutocastCPU only supports the bfloat16 data type:
RuntimeError: Currently, AutocastCPU only supports Bfloat16 as the autocast_cpu_dtype
Any suggestions?
Hi @lzeladam ,
First of all, don't use the CPU if you have a GPU hehe ;) But I can't reproduce this issue, since the code (and I) use bfloat16. Have you perhaps changed the dtype in the code?
Both lines 196 and 256 look like this, which suggests you may have fiddled a little with the dtype argument:
with torch.amp.autocast(device_type=device_type, dtype=torch.bfloat16):
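Given the error above, one way to guard against an unsupported dtype on CPU is to resolve the dtype before entering the autocast context. A minimal, torch-free sketch (the helper name is hypothetical, and the CPU restriction simply reflects the error message quoted above):

```python
def resolve_dtype(device_type: str, requested: str = "bfloat16") -> str:
    # Hypothetical helper: CPU autocast currently only supports bfloat16,
    # so fall back to it there; on other devices keep the requested dtype.
    if device_type == "cpu" and requested != "bfloat16":
        return "bfloat16"
    return requested
```

Something like `resolve_dtype("cpu", "float16")` would then return `"bfloat16"` instead of letting autocast raise at runtime.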
I have updated the code so that the device type is set to cpu when the device is set to cpu. Are cpu and cuda the only device types in PyTorch as it stands today?
Moved the device type out of the configurator area, so that it is only set based on the device. It made more sense that way.
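The derivation described above can be sketched roughly like this (a guess at the logic, not the exact diff; the function name is hypothetical):

```python
def device_type_from_device(device: str) -> str:
    # 'cuda', 'cuda:0', 'cuda:1' all map to the 'cuda' device type;
    # everything else (here just 'cpu') maps to 'cpu'
    return "cuda" if device.startswith("cuda") else "cpu"
```

This keeps the device type a pure function of the --device argument, so the configurator never has to set it separately.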
I think this is part of a bigger refactor I'd like to make, because I want both device and dtype to be configurable from args, and to potentially apply the autocast context manager only where it makes sense (e.g. 'cuda' and 'bfloat16'). And 'float16' is currently not well supported, because there is no gradient scaler...
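A rough sketch of what that refactor could look like, under the assumption that autocast stays CUDA-only and a gradient scaler is enabled just for float16 (function names are hypothetical):

```python
from contextlib import nullcontext

def autocast_ctx(device_type: str, dtype: str):
    # Apply autocast only where it makes sense; everywhere else
    # return a no-op context so the training loop stays uniform.
    if device_type == "cuda" and dtype in ("bfloat16", "float16"):
        import torch  # only needed on the CUDA path in this sketch
        return torch.amp.autocast(device_type=device_type,
                                  dtype=getattr(torch, dtype))
    return nullcontext()

def scaler_enabled(dtype: str) -> bool:
    # float16 needs a GradScaler to avoid gradient underflow;
    # bfloat16 and float32 have enough dynamic range without one
    return dtype == "float16"
```

The training loop can then always write `with autocast_ctx(device_type, dtype): ...` regardless of device.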
CPU support is now added, so closing this issue. ty
@Ricardicus I've been using your LSTM implementation. Would you want to collaborate on writing a C version of nanoGPT? I think it would be a great addition to your AI portfolio.
P.S. I was browsing through nanoGPT's issue history and recognized your username. Amazing work!