minGPT

A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training

Results: 79 minGPT issues

# Motivation People may want to use minGPT in different precisions (fp16, fp32, bf16). This PR adds that option to the library. Only floating point precisions...
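As a rough illustration of what per-precision training can look like in PyTorch (a sketch only; the `precision` option and `training_step` helper below are hypothetical, not this PR's actual interface):

```python
import contextlib
import torch

# Hypothetical precision switch; the option name and helper are illustrative,
# not the PR's actual interface.
PRECISIONS = {'fp16': torch.float16, 'bf16': torch.bfloat16}

def training_step(model, optimizer, x, y, precision='fp32'):
    # run the forward pass under autocast for fp16/bf16, plain fp32 otherwise
    ctx = (torch.autocast(device_type='cuda', dtype=PRECISIONS[precision])
           if precision in PRECISIONS else contextlib.nullcontext())
    with ctx:
        logits, loss = model(x, y)   # minGPT's GPT.forward returns (logits, loss)
    optimizer.zero_grad(set_to_none=True)
    loss.backward()                  # fp16 usually also wants a GradScaler (see the AMP issue below)
    optimizer.step()
    return loss.item()
```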

No actual functionality changes, just a lot of cleanup to make the project look like it belongs in the present instead of something half abandoned from 10 years ago. Some...

Dear @karpathy, Thanks for this nice GPT implementation. It really helps a lot! When comparing this GPT with other Transformers, I found that here all the attention layers were using...

Hi, this is an awesome repository. I was reading through `AdditionDataset` and noticed the block size is calculated as follows: ```python # +1 due to potential carry overflow, but then...
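For context, the reasoning that comment alludes to can be spelled out with a small worked example (a paraphrase of the dataset's logic, not the exact code):

```python
ndigit = 2
# a and b each have ndigit digits; their sum can spill into one extra digit
# (carry overflow, e.g. 99 + 99 = 198), so the rendered sequence a|b|a+b has:
seq_len = ndigit + ndigit + (ndigit + 1)   # = 7 tokens for ndigit = 2
# the model only ever predicts the next token, so the final digit of the sum
# never has to be fed back in as input, hence the -1:
block_size = seq_len - 1                   # = 6
```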

I'm experimenting with `amp.autocast` (automatic mixed precision) with torch version '1.7.0a0+7036e91'. I find the performance has _not improved_ (it is slightly degraded: 36.2s for AMP vs 30.4s for FP32 with N=1...
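For reference, the usual AMP training step wraps the forward pass in autocast and scales the loss (a sketch, not the exact code from the issue; `model`, `optimizer`, `x`, `y` stand in for a minGPT model and one batch). AMP generally only pays off on GPUs with tensor cores and reasonably large matmuls, so a small model can come out slower, as observed here.

```python
import torch
from torch.cuda.amp import autocast, GradScaler

scaler = GradScaler()

def amp_step(model, optimizer, x, y):
    optimizer.zero_grad(set_to_none=True)
    with autocast():                  # forward pass in mixed (fp16) precision
        logits, loss = model(x, y)
    scaler.scale(loss).backward()     # scale the loss to avoid fp16 gradient underflow
    scaler.step(optimizer)            # unscale gradients, then take the optimizer step
    scaler.update()
    return loss.item()
```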

use config n_head instead of hardcoded 4 heads in model attention block
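The reshape that the head count feeds into looks roughly like this (illustrative values; the point is simply that `n_head` comes from the config rather than a literal 4):

```python
import torch

B, T, n_embd, n_head = 2, 8, 48, 3           # hypothetical config values
assert n_embd % n_head == 0
x = torch.randn(B, T, n_embd)
# split the channel dimension into n_head heads of size n_embd // n_head
k = x.view(B, T, n_head, n_embd // n_head).transpose(1, 2)   # (B, n_head, T, head_dim)
```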

I think that you deliberately avoided this problem in the code :) If so, how should it be done?

`test_loss` is not defined when `test_dataset` is None, so I added a condition to check whether `test_dataset` is present or not.
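A sketch of what that guard might look like in the trainer's epoch loop (names follow minGPT's `Trainer`, but this is illustrative rather than the exact diff):

```python
def end_of_epoch(trainer, run_epoch, best_loss):
    test_loss = None
    if trainer.test_dataset is not None:   # only evaluate when a test set exists
        test_loss = run_epoch('test')
    # with no test set, fall back to treating every epoch's model as "good"
    good_model = trainer.test_dataset is None or test_loss < best_loss
    return test_loss, good_model
```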

I wish minGPT could also work with the HuggingFace Hub in order to be even more easily experienced by any user. I admit I don't know if this is feasible while...

In the pull request, I implemented a play_word example, which trains a GPT model to translate from modern English to Shakespearean English. It uses BPE to build the vocabulary.
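As a small usage sketch of BPE tokenization with the repo's `mingpt.bpe.BPETokenizer` (assuming that is the tokenizer the example relies on; it fetches the GPT-2 vocabulary and merges on first use, and the PR's dataset code may build its vocabulary differently):

```python
from mingpt.bpe import BPETokenizer

tokenizer = BPETokenizer()
modern = "You are welcome."
tokens = tokenizer(modern)            # (1, T) tensor of GPT-2 BPE token ids
print(tokens.shape)
print(tokenizer.decode(tokens[0]))    # decodes back to the original text
```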