What's the difference between SFTTrainer (TRL) and Trainer (Transformers)?

Open • SatireY opened this issue 11 months ago • 1 comment

Hi, first of all I want to thank you for the excellent code, which has helped my research. I want to migrate my training code from Trainer to SFTTrainer (to train Llama 2). Here is my original training script:

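import torch
from accelerate import Accelerator
from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_kbit_training
from transformers import LlamaForCausalLM, LlamaTokenizer, Trainer, TrainingArguments

# config, MyDataCollatorWithPadding, profiler_callback, enable_profiler and
# total_steps are defined elsewhere in my project.
tokenizer = LlamaTokenizer.from_pretrained(config.checkpoints.path)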
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

device_index = Accelerator().process_index
device_map = {"": device_index}

model = LlamaForCausalLM.from_pretrained(
    config.checkpoints.path,
    load_in_8bit=True,
    device_map=device_map,
    torch_dtype=torch.bfloat16,
)
# add tokens
tokenizer.add_tokens(...)

model.resize_token_embeddings(len(tokenizer))
train_dataset = ...
eval_dataset = ...

peft_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    inference_mode=False,
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    modules_to_save=["embed_tokens", "lm_head"],
)

model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
output_dir = config.train.output_dir

train_config = {
    "lora_config": peft_config,
    "learning_rate": config.train.learning_rate,
    "num_train_epochs": config.train.num_train_epochs,
    "gradient_accumulation_steps": config.train.gradient_accumulation_steps,
    "per_device_eval_batch_size": config.train.per_device_eval_batch_size,
    "per_device_train_batch_size": config.train.per_device_train_batch_size,
    "gradient_checkpointing": config.train.gradient_checkpointing,
    "load_best_model_at_end": config.train.load_best_model_at_end,
}
# Define training args
training_args = TrainingArguments(
    output_dir=output_dir + "_longcode" if config.train.long_code else output_dir,
    report_to=config.train.report_to,
    overwrite_output_dir=config.train.overwrite_output_dir,
    bf16=config.train.bf16,  # Use BF16 if available
    # logging strategies
    logging_dir=f"{output_dir}/logs",
    logging_strategy=config.train.logging_strategy,
    logging_steps=config.train.logging_steps,
    save_strategy=config.train.save_strategy,
    save_total_limit=config.train.save_total_limit,
    # eval strategies
    evaluation_strategy=config.train.evaluation_strategy,
    eval_steps=config.train.eval_steps,
    # optim="adamw_torch_fused",
    optim=config.train.optim,
    max_steps=total_steps if enable_profiler else -1,
    ddp_find_unused_parameters=False,
    **{k: v for k, v in train_config.items() if k != "lora_config"},
)

data_collator = MyDataCollatorWithPadding(tokenizer=tokenizer)
# transformers
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    data_collator=data_collator,
    callbacks=[profiler_callback] if enable_profiler else [],
)
trainer.train()
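
MyDataCollatorWithPadding is not shown in the script above. As a rough illustration only, a collator for pre-tokenized examples with input_ids, attention_mask and labels columns usually right-pads each batch to its longest sequence and marks the padded label positions with -100 so the loss skips them. The sketch below shows that idea in plain Python; the function name pad_batch, its arguments, and the token IDs are hypothetical and are not taken from the original collator.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def pad_batch(features, pad_token_id):
    """Right-pad every example in the batch to the longest sequence in it.

    `features` is a list of dicts with "input_ids", "attention_mask" and
    "labels" lists of equal length (an assumption of this sketch).
    """
    longest = max(len(f["input_ids"]) for f in features)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for f in features:
        pad = longest - len(f["input_ids"])
        batch["input_ids"].append(f["input_ids"] + [pad_token_id] * pad)
        # Padded positions get attention 0 and an ignored label.
        batch["attention_mask"].append(f["attention_mask"] + [0] * pad)
        batch["labels"].append(f["labels"] + [IGNORE_INDEX] * pad)
    return batch


example_batch = pad_batch(
    [
        {"input_ids": [5, 6], "attention_mask": [1, 1], "labels": [5, 6]},
        {"input_ids": [7, 8, 9], "attention_mask": [1, 1, 1], "labels": [7, 8, 9]},
    ],
    pad_token_id=2,
)
```

A real collator would convert these lists to tensors before returning them. Because the script sets pad_token to eos_token, masking by position (as here) rather than by token ID avoids accidentally ignoring genuine end-of-sequence labels.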

There are a few things about the SFTTrainer API that confuse me:

1. Does SFTTrainer accept raw text instead of tokens (input_ids, attention_mask, labels)? In the script above, train_dataset and eval_dataset are already tokenized and contain the columns [input_ids, attention_mask, labels].
2. Will SFTTrainer bring all sequences to the 'max_seq_length' length (by padding or truncating)?
3. I am working on a question-answering task. How does SFTTrainer train on it? Does it use the 'labels' column? In particular, the "response" required in my task has a fixed length.

@younesbelkada I see you answering a lot of questions in the community, so I would like to ask for your help. Thank you.
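
To make the second and third questions concrete: one common way to prepare a question-answering example for a causal LM is to concatenate prompt and response, set the prompt positions in labels to -100 so that only the response contributes to the loss, and pad or truncate to a fixed length. The sketch below shows that layout in plain Python. It is an illustration under those assumptions, not a description of what SFTTrainer does internally; build_example, its arguments, and the token IDs are hypothetical.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def build_example(prompt_ids, response_ids, max_len, pad_id):
    """Build one fixed-length training example with the prompt masked out."""
    # Concatenate prompt and response, then truncate to max_len.
    input_ids = (prompt_ids + response_ids)[:max_len]
    # Only response tokens keep their IDs as labels; prompt tokens are ignored.
    labels = ([IGNORE_INDEX] * len(prompt_ids) + response_ids)[:max_len]
    attention_mask = [1] * len(input_ids)
    # Right-pad up to max_len (matches padding_side = "right" in the script).
    pad = max_len - len(input_ids)
    return {
        "input_ids": input_ids + [pad_id] * pad,
        "attention_mask": attention_mask + [0] * pad,
        "labels": labels + [IGNORE_INDEX] * pad,
    }


example = build_example(prompt_ids=[1, 2, 3], response_ids=[4, 5], max_len=8, pad_id=0)
```

With a dataset laid out like this, the loss depends only on the response tokens, regardless of how long the prompt is.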

Feb 28 '24 09:02 SatireY

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Mar 29 '24 15:03 github-actions[bot]
