Carson Lam

Results: 5 comments by Carson Lam

So I did some research on my own, and basically my first two questions can be answered by looking at the Hugging Face Transformers repository: https://github.com/huggingface/transformers/blob/main/src/transformers/modeling_utils.py

@danjohnvelasco As long as you use the same name `self.lm_head`, then when you load the pretrained model from the dictionary of parameters, these linear parameters will be replaced with the trained...
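
A minimal sketch of the name-matching behavior being described; the tiny module and the hand-built "checkpoint" dict below are stand-ins for illustration, not the model from this thread. `load_state_dict` pairs parameters by attribute name, so a head registered as `self.lm_head` receives the checkpoint's `lm_head.*` weights:

```
import torch
from torch import nn

class TinyLM(nn.Module):
    def __init__(self, hidden=8, vocab=10):
        super().__init__()
        # The attribute name determines the state-dict key: "lm_head.weight"
        self.lm_head = nn.Linear(hidden, vocab, bias=False)

# Pretend this dict came from a pretrained checkpoint.
pretrained = {"lm_head.weight": torch.ones(10, 8)}

model = TinyLM()
model.load_state_dict(pretrained)
# The randomly initialized lm_head weights have been replaced.
assert torch.equal(model.lm_head.weight, torch.ones(10, 8))
```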

Hi @immortal3, I love the minimal implementation. I'm having trouble reproducing the 25% speedup, though. I've been using `time` to compare the two implementations and the 125M model for generating...
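
A rough sketch of the kind of wall-clock comparison being described, using the stock Hugging Face GPT-2 (~124M parameters) as a stand-in for the implementations in this thread; the prompt and token count are placeholders, and a fairer benchmark would warm up first and average over several runs:

```
import time
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("Hello, my name is", return_tensors="pt")

# Time one generation pass; repeat with the other implementation to compare.
start = time.perf_counter()
model.generate(**inputs, max_new_tokens=50)
print(f"generation took {time.perf_counter() - start:.2f}s")
```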

Wow, what great response time! Here are the cells. Cell 0:

```
import numpy as np
import torch
from transformers import GPT2Tokenizer
from transformers import TrainingArguments
from accelerate import Accelerator
from...
```

Thanks @muellerzr, I did as you said, and in every cell above

```
args = (model, tokenizer, config)
notebook_launcher(training_loop, args, num_processes=2)
```

I have verified that each has `torch.cuda.is_initialized()`...
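
For context, a minimal self-contained sketch of the `notebook_launcher` pattern being debugged here; `training_loop` and its argument are placeholders, not the actual training code from this thread. The relevant pitfall is that `notebook_launcher` refuses to start if CUDA is already initialized in the notebook process, so anything CUDA-related has to happen inside the launched function:

```
import torch
from torch import nn
from accelerate import Accelerator, notebook_launcher

def training_loop(hidden_size):
    # One Accelerator per spawned process; it picks the right device.
    accelerator = Accelerator()
    model = nn.Linear(hidden_size, hidden_size)
    model = accelerator.prepare(model)  # moves model to this process's device
    accelerator.print(f"running on {accelerator.device}")

# This must hold *before* launching, otherwise the spawned
# processes cannot initialize CUDA themselves.
assert not torch.cuda.is_initialized()

notebook_launcher(training_loop, args=(16,), num_processes=2)
```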