nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Hi, is there any way we could implement perplexity from `torchmetrics` so that perplexity is logged during training?
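One option is `torchmetrics.text.Perplexity`, but since perplexity is just the exponential of the mean cross-entropy, it can also be logged from the loss the training loop already computes, with no extra dependency. A minimal sketch, using hypothetical random logits/targets in place of a real forward pass:

```python
import math

import torch
import torch.nn.functional as F

# Hypothetical stand-ins for a model forward pass: shapes follow the
# usual (batch, seq_len, vocab_size) layout for logits.
vocab_size = 8
logits = torch.randn(2, 5, vocab_size)
targets = torch.randint(0, vocab_size, (2, 5))

# Perplexity = exp(mean token-level cross-entropy).
loss = F.cross_entropy(logits.view(-1, vocab_size), targets.view(-1))
ppl = math.exp(loss.item())
```

In a training loop you would compute `ppl` from the existing loss each eval interval and log it alongside the loss.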
Best to select the device based on what's available (for majority of cases)
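A common way to pick the device automatically, preferring CUDA, then Apple's MPS backend, then CPU (a sketch; nanoGPT itself sets `device` in its config):

```python
import torch

# Prefer CUDA, then Apple MPS, then fall back to CPU.
if torch.cuda.is_available():
    device = "cuda"
elif hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
    device = "mps"
else:
    device = "cpu"
```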
- https://github.com/tesla-cat/nanoDeepSeek/blob/main/compare_nanoGPT.py
- https://github.com/tesla-cat/nanoDeepSeek/tree/main/src/nanoGPT

## model

```py
import math
from dataclasses import dataclass
from typing import Dict
import torch as tc
import torch.nn as nn
from torch.nn import functional as...
```
For beginners or for educational purposes, you might consider the project [mini-nanoGPT](https://github.com/ystemsrx/mini-nanoGPT). It is a GUI version of nanoGPT that implements nearly all of its functionality, sometimes even in a more refined...
Can someone help with this issue?

(mlenv) C:\Users\Admin\nanoGPT>python train.py config/train_shakespeare_char.py
Overriding config with config/train_shakespeare_char.py:
# train a miniature character-level shakespeare model
# good for debugging and playing on macbooks and...
Any thoughts?
Summary: This PR addresses two things. First, it extends model_ext.py and train_sat.py from leyan_branch with my additions from the previous PR. Second, it fixes some run-time issues in the flash...
# Post-Training Quantization for GPT-2
- In this commit, I add post-training quantization for GPT-2; most of the code remains the same.
- Made the new...
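A minimal sketch of one post-training quantization route in PyTorch: dynamic quantization, which stores `nn.Linear` weights as int8 and quantizes activations on the fly. The tiny MLP below is a hypothetical stand-in for a GPT-2 block; the commit's actual approach may differ.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a GPT-2 MLP block (the real model would be
# model.GPT from nanoGPT).
model = nn.Sequential(nn.Linear(16, 64), nn.GELU(), nn.Linear(64, 16))
model.eval()

# Dynamic post-training quantization: int8 weights for nn.Linear layers,
# activations quantized at runtime. CPU-only.
qmodel = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 16)
with torch.no_grad():
    y = qmodel(x)
```

Dynamic quantization needs no calibration data, which makes it the easiest PTQ baseline to compare against static or weight-only schemes.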
apply rotary embeddings
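Rotary position embeddings (RoPE) rotate each (2i, 2i+1) channel pair of the query/key vectors by a position-dependent angle. A minimal sketch, assuming the common "split in half" pairing; note that stock nanoGPT uses learned absolute positional embeddings, so this is an add-on:

```python
import torch

def apply_rotary(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply rotary embeddings to x of shape (batch, seq_len, dim).

    Channel i in the first half is paired with channel i in the second
    half; each pair is rotated by angle position * inv_freq[i].
    """
    b, t, d = x.shape
    half = d // 2
    # Per-pair inverse frequencies, shape (half,)
    inv_freq = base ** (-torch.arange(half, dtype=torch.float32) / half)
    # Angle for each (position, pair), shape (t, half)
    ang = torch.arange(t, dtype=torch.float32)[:, None] * inv_freq[None, :]
    cos, sin = ang.cos(), ang.sin()
    x1, x2 = x[..., :half], x[..., half:]
    # Standard 2D rotation of each (x1, x2) pair
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

q = torch.randn(2, 4, 8)
q_rot = apply_rotary(q)
```

Because the rotation angle at position 0 is zero, the first token's vector is unchanged, and since rotations are norm-preserving the per-token vector norms are unaffected.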
Option 1 - Flag:
- Ablation code
- ReLU
- Restricted setting
- Softmax q-error

Option 2 - Flag:
- Scale attention weights based on the context window * log(query_position) [error...
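One reading of "scale attention weights by log(query_position)" is to multiply each query row's attention logits by a log of its position before the causal softmax; the sketch below assumes that interpretation (using log(1 + position) so position 0 stays defined) and is purely illustrative:

```python
import torch

T = 5  # context window length (hypothetical)
scores = torch.randn(T, T)  # raw q.k^T attention logits

# Scale row i (query position i) by log(1 + i) before the softmax.
pos_scale = torch.log1p(torch.arange(T, dtype=torch.float32)).unsqueeze(1)
scaled = scores * pos_scale

# Standard causal mask, then softmax over key positions.
mask = torch.tril(torch.ones(T, T))
att = torch.softmax(scaled.masked_fill(mask == 0, float("-inf")), dim=-1)
```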