
Offsite-Tuning: Transfer Learning without Full Model

6 offsite-tuning issues

Hi, I noticed that you trained the NLP emulator with the first 30 chunks of the Pile dataset. I wonder how large the 30 chunks are? Or in other words, how...
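While waiting for an answer, one rough way to gauge a chunk's size is to stream a single shard and count tokens. This is only a sketch under assumptions of my own: the standard Pile shard naming (e.g. a local `00.jsonl.zst`), records with a `"text"` field, and an OPT tokenizer; it is not code from this repository.

```python
# Rough size check for one Pile shard (assumed local file "00.jsonl.zst").
# Requires the `zstandard` package for the compressed JSON lines.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
shard = load_dataset("json", data_files="00.jsonl.zst", split="train", streaming=True)

total_tokens = 0
for example in shard:
    total_tokens += len(tokenizer(example["text"])["input_ids"])
print(f"tokens in shard 00: {total_tokens:,}")
```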

offsite_tuning/run_image_classification.py

```python
def to_teacher(model, args):
    l = args.student_l_pad
    print(type(model.model))
    if isinstance(model, OPTForCausalLM):
        r = len(model.model.decoder.layers) - args.student_r_pad
        model.model.decoder.layers = model.model.decoder.layers[:l] + \
            model.teacher + model.model.decoder.layers[r:]
    elif isinstance(model, GPT2LMHeadModel):
        r = len(model.transformer.h)...
```

In data.py, process_text2text_datasets:

```python
def process_text2text_datasets(raw_datasets, args, tokenizer, accelerator):
    task = task_dict[args.dataset_name]
    column_names = raw_datasets["train"].column_names

    def tokenize_function(examples):
        context = task.get_context(examples)
        target = task.get_target(examples)
        context = tokenizer(context)
        target = tokenizer(target)
        #...
```

It seems all the eval for LLMs is done using 1 GPU. Can you suggest ways to run distributed eval?
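Since the data-processing code above already passes an `accelerator` around, one option is to run the eval loop itself under HF Accelerate. The sketch below is only illustrative, not the repository's implementation: the function name `evaluate_perplexity` and the assumption that each batch carries `input_ids` and `labels` are mine. It would be launched with `accelerate launch` so that each GPU evaluates a slice of the dataloader and the losses are gathered across ranks.

```python
# Minimal sketch of multi-GPU perplexity eval with HF Accelerate.
import torch
from accelerate import Accelerator

def evaluate_perplexity(model, eval_dataloader):
    accelerator = Accelerator()
    model, eval_dataloader = accelerator.prepare(model, eval_dataloader)
    model.eval()
    losses = []
    for batch in eval_dataloader:
        with torch.no_grad():
            outputs = model(**batch)
        # replicate the scalar batch loss per sample, then gather it from all ranks
        losses.append(accelerator.gather_for_metrics(
            outputs.loss.repeat(batch["input_ids"].shape[0])))
    losses = torch.cat(losses)
    return torch.exp(losses.mean())
```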

Great work with an elegant yet effective idea! Thanks for sharing. However, I have a minor suggestion. It is well known that in the LLM fine-tuning paradigm, adapter-tuning [1] — done...
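For readers unfamiliar with reference [1], a Houlsby-style bottleneck adapter looks roughly like the sketch below. The class name, bottleneck width, and placement are illustrative assumptions, not code from this repository.

```python
# Illustrative bottleneck adapter module (Houlsby et al. style).
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    def __init__(self, hidden_dim, bottleneck_dim=64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck_dim, hidden_dim)

    def forward(self, x):
        # residual connection keeps the frozen backbone's representation intact
        return x + self.up(self.act(self.down(x)))
```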