What's the difference between SFTTrainer (TRL) and Trainer (Transformers)?
Hi, first of all I want to thank you for the excellent code that facilitated my research. I want to migrate my training code from Trainer to SFTTrainer (training Llama 2). Here is my original training script:
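import torch
from accelerate import Accelerator
from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_kbit_training
from transformers import LlamaForCausalLM, LlamaTokenizer, Trainer, TrainingArguments

tokenizer = LlamaTokenizer.from_pretrained(config.checkpoints.path)
# Llama has no pad token, so reuse EOS and pad on the right
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"
# place the whole model on this process's device
device_index = Accelerator().process_index
device_map = {"": device_index}
model = LlamaForCausalLM.from_pretrained(
    config.checkpoints.path,
    load_in_8bit=True,
    device_map=device_map,
    torch_dtype=torch.bfloat16,
)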
# add tokens
tokenizer.add_tokens(...)
model.resize_token_embeddings(len(tokenizer))
train_dataset=...
eval_dataset=...
peft_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    inference_mode=False,
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    modules_to_save=["embed_tokens", "lm_head"],
)
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
output_dir = config.train.output_dir
train_config = {
    "lora_config": peft_config,
    "learning_rate": config.train.learning_rate,
    "num_train_epochs": config.train.num_train_epochs,
    "gradient_accumulation_steps": config.train.gradient_accumulation_steps,
    "per_device_eval_batch_size": config.train.per_device_eval_batch_size,
    "per_device_train_batch_size": config.train.per_device_train_batch_size,
    "gradient_checkpointing": config.train.gradient_checkpointing,
    "load_best_model_at_end": config.train.load_best_model_at_end,
}
# Define training args
training_args = TrainingArguments(
    output_dir=output_dir + "_longcode" if config.train.long_code else output_dir,
    report_to=config.train.report_to,
    overwrite_output_dir=config.train.overwrite_output_dir,
    bf16=config.train.bf16,  # Use BF16 if available
    # logging strategies
    logging_dir=f"{output_dir}/logs",
    logging_strategy=config.train.logging_strategy,
    logging_steps=config.train.logging_steps,
    save_strategy=config.train.save_strategy,
    save_total_limit=config.train.save_total_limit,
    # eval strategies
    evaluation_strategy=config.train.evaluation_strategy,
    eval_steps=config.train.eval_steps,
    # optim="adamw_torch_fused",
    optim=config.train.optim,
    max_steps=total_steps if enable_profiler else -1,
    ddp_find_unused_parameters=False,
    **{k: v for k, v in train_config.items() if k != "lora_config"},
)
data_collator = MyDataCollatorWithPadding(tokenizer=tokenizer)
# transformers
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    data_collator=data_collator,
    callbacks=[profiler_callback] if enable_profiler else [],
)
trainer.train()
There are a few things about the SFTTrainer API that confuse me:
- Does SFTTrainer accept raw text instead of tokens (input_ids, attention_mask, labels)? In the above script, train_dataset and eval_dataset are already tokenized and contain the [input_ids, attention_mask, labels] columns.
- Will SFTTrainer pad or truncate all data to the 'max_seq_length' length?
- I am working on a question-answering task. How does SFTTrainer train on it? Does it use the data in the 'labels' column? In particular, the "response" required in my task has a fixed length.
@younesbelkada I see you answering a lot of questions in the community, so I would like to ask for your help. Thank you.
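To make the last two questions concrete, here is a minimal pure-Python sketch of the [input_ids, attention_mask, labels] layout being asked about. It is not SFTTrainer's internal code; it only illustrates the general convention that label positions set to -100 are skipped by the cross-entropy loss. The helpers build_example and pad_batch, the toy token ids, and pad_token_id=2 are all hypothetical names and values chosen for illustration.

```python
IGNORE_INDEX = -100  # label value that the cross-entropy loss skips


def build_example(prompt_ids, response_ids, max_seq_length):
    """Join prompt and response, truncate, and mask the prompt in labels.

    Hypothetical helper: only the response tokens contribute to the loss.
    """
    input_ids = (prompt_ids + response_ids)[:max_seq_length]
    labels = ([IGNORE_INDEX] * len(prompt_ids) + response_ids)[:max_seq_length]
    return {
        "input_ids": input_ids,
        "attention_mask": [1] * len(input_ids),
        "labels": labels,
    }


def pad_batch(examples, pad_token_id):
    """Right-pad each example to the longest one in the batch.

    Padding positions are masked by position, not by token id. This matters
    when pad_token == eos_token, as in the script above: masking by id would
    also hide the real EOS token at the end of a response.
    """
    longest = max(len(ex["input_ids"]) for ex in examples)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for ex in examples:
        n_pad = longest - len(ex["input_ids"])
        batch["input_ids"].append(ex["input_ids"] + [pad_token_id] * n_pad)
        batch["attention_mask"].append(ex["attention_mask"] + [0] * n_pad)
        batch["labels"].append(ex["labels"] + [IGNORE_INDEX] * n_pad)
    return batch


# Toy token ids: a short and a long prompt with the same fixed-length response.
short = build_example([5, 6], [7, 8, 9], max_seq_length=8)
long = build_example([5, 6, 10, 11], [7, 8, 9], max_seq_length=8)
batch = pad_batch([short, long], pad_token_id=2)
```

In this sketch, sequences are truncated to max_seq_length but padded only to the longest example in the batch, so shorter batches are not padded all the way to max_seq_length. Whether SFTTrainer does the same depends on its configuration (for example, whether packing is enabled), so it should be checked against the TRL documentation for the installed version.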