
training not yielding any positive results

boscoj2008 opened this issue 2 years ago

Environment info

  • adapter-transformers version: current version
  • Platform: unix server
  • Python version: 3.8
  • PyTorch version (GPU?): PyTorch 1.11, A100 GPU

Hello, I define my model as below to return logits:

import torch.nn as nn
from transformers import RobertaAdapterModel  # adapter-transformers


class PLM(nn.Module):
    def __init__(self, lm_name='roberta-base'):
        super().__init__()

        self.model = RobertaAdapterModel.from_pretrained(lm_name)

        # Add a new adapter plus a matching classification head, then freeze
        # the base model so only the adapter (and head) weights are trained.
        self.model.add_adapter("classification", set_active=True)
        self.model.add_classification_head("classification", num_labels=2)
        self.model.train_adapter("classification")

    def forward(self, x):
        """Forward pass for classification; returns the logits tensor."""
        logits = self.model(x)[0]
        return logits
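
For context, the wrapper is then called roughly like this (a minimal sketch; the tokenizer handling shown here is illustrative, not my exact loop):

from transformers import RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = PLM()

batch = tokenizer(["an example sentence", "another one"],
                  padding=True, return_tensors="pt")
logits = model(batch["input_ids"])  # shape: (batch_size, num_labels)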

The rest of my code uses an existing custom PyTorch training loop. However, I'm not getting results from this setup (0% F1), while the same loop with a vanilla fine-tuning approach gets 50-60% F1. What could be the problem? Thank you for your response.

boscoj2008 commented Aug 23 '22

Hey @boscoj2008, the adapter setup and activation code you're showing doesn't seem to have any issues, so it's difficult to identify the problem from here. You could try the built-in AdapterTrainer class to see whether it makes any difference. It would also help if you could share some further details on the fine-tuning approach that works, so we can check for differences between that and the adapter setup. As general hints: adapter training often needs more epochs to converge (e.g. more than 10, compared to ~3 for full fine-tuning) and usually works best with a slightly higher learning rate (e.g. 1e-4).
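
For reference, a rough sketch of what such an AdapterTrainer setup with those hyperparameters could look like (assuming the adapter-transformers package; plm, train_dataset and eval_dataset are placeholders for your wrapper instance and your tokenized datasets, not from your code):

from transformers import TrainingArguments
from transformers.adapters import AdapterTrainer

training_args = TrainingArguments(
    output_dir="./out",
    learning_rate=1e-4,              # slightly higher than typical full fine-tuning
    num_train_epochs=15,             # adapters often need more epochs to converge
    per_device_train_batch_size=32,
    evaluation_strategy="epoch",
)

trainer = AdapterTrainer(
    model=plm.model,                 # the RobertaAdapterModel inside your wrapper
    args=training_args,
    train_dataset=train_dataset,     # placeholder: your tokenized training set
    eval_dataset=eval_dataset,       # placeholder: your tokenized eval set
)
trainer.train()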

calpt commented Sep 14 '22

Thanks @calpt for the insights. I have applied them and at least see an improvement. May I ask about the invertible adapters used in MAD-X? Can invertible adapters be used without language adapters, i.e., in conjunction with task adapters?

boscoj2008 commented Oct 11 '22

This issue has been automatically marked as stale because it has been without activity for 90 days. This issue will be closed in 14 days unless you comment or remove the stale label.

adapter-hub-bert commented Jan 10 '23

> Thanks @calpt for the insights. I have applied them and at least see an improvement. May I ask about the invertible adapters used in MAD-X? Can invertible adapters be used without language adapters, i.e., in conjunction with task adapters?

Yes, in principle this is possible. You can configure invertible adapters similarly to other adapter components via the adapter config (see here). Specifically, the two attributes inv_adapter and inv_adapter_reduction_factor control the invertible modules. For example, you can add a new adapter with the same invertible configuration used in MAD-X like this:

model.add_adapter("adapter_name", config=PfeifferConfig(inv_adapter="nice", inv_adapter_reduction_factor=2))
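
Putting that together with a task adapter and head, a minimal sketch could look like this (the adapter/head name is just an example, and the import paths assume adapter-transformers):

from transformers import RobertaAdapterModel
from transformers.adapters import PfeifferConfig

model = RobertaAdapterModel.from_pretrained("roberta-base")
config = PfeifferConfig(inv_adapter="nice", inv_adapter_reduction_factor=2)

# Task adapter with invertible modules, plus a classification head on top.
model.add_adapter("classification", config=config, set_active=True)
model.add_classification_head("classification", num_labels=2)
model.train_adapter("classification")  # freezes the base model; adapter + invertible modules stay trainable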

calpt commented Jan 10 '23

This issue has been automatically marked as stale because it has been without activity for 90 days. This issue will be closed in 14 days unless you comment or remove the stale label.

adapter-hub-bert commented Apr 12 '23

This issue was closed because it was stale for 14 days without any activity.

adapter-hub-bert commented Apr 27 '23