
Clarification on end_to_end vs trainer.train_embeddings

naddeoa opened this issue · 2 comments

I'm experimenting with a simple model and I'm confused about whether I should expect the underlying sentence transformer model to change during training.

    from setfit import SetFitModel, Trainer, TrainingArguments

    # labels, train_dataset, and eval_dataset are defined earlier in my script.

    # Define the model and training arguments
    model = SetFitModel.from_pretrained(
        "sentence-transformers/all-MiniLM-L6-v2",
        multi_target_strategy="one-vs-rest",
        use_differentiable_head=True,
        head_params={"out_features": len(labels)},
        labels=labels,
    )

    args = TrainingArguments(
        batch_size=128,
        # end_to_end=False,
        # body_learning_rate=10.0,
        num_epochs=4,
        evaluation_strategy="no",
        save_strategy="no",
        load_best_model_at_end=True,
    )

    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        metric="accuracy",
        column_mapping={
            "text": "text",
            "label": "label",
        },  # Map dataset columns to text/label expected by trainer
    )

    trainer.train()

The documentation for end_to_end implies that the underlying model only changes when this argument is set, but experimentally that isn't true. The underlying sentence transformer (the "body", as I understand it) always seems to be trained by the train() logic, which is hard-coded to call train_embeddings(). I confirmed that the body changed by comparing my model's output scores, and by comparing the embeddings produced by the base sentence transformer model against those produced by the body attached to my model after training.
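
For reference, here is roughly how I compared the embeddings (a minimal sketch with illustrative inputs, not my exact code):

    import numpy as np
    from sentence_transformers import SentenceTransformer

    texts = ["an example sentence", "another example sentence"]

    # Embeddings from an untouched copy of the base model.
    base = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
    base_embeddings = base.encode(texts)

    # Embeddings from the body attached to my trained SetFitModel.
    trained_embeddings = model.model_body.encode(texts)

    # After trainer.train() these differ, even with end_to_end left at its default.
    print(np.allclose(base_embeddings, trained_embeddings))  # False for me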

Did I misunderstand the docs? The only way I can get this to not happen is to comment out the train_embeddings() call in the setfit library's train() here
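
If I'm reading the Trainer source right, a less invasive workaround might be to skip train() entirely and run only the classifier phase (assuming train_classifier() accepts the raw texts and labels the way train() passes them internally; I haven't verified this beyond skimming the source):

    # Sketch of a workaround: fit only the head, leaving the body untouched.
    # train() is hard-coded to run train_embeddings() first, so call
    # train_classifier() directly instead.
    x_train = train_dataset["text"]
    y_train = train_dataset["label"]
    trainer.train_classifier(x_train, y_train)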

naddeoa · May 29, 2024