training not yielding any positive results
Environment info
- adapter-transformers version: current version
- Platform: unix server
- Python version: 3.8
- PyTorch version (GPU?): PyTorch 11.1, A100
Hello, I define my model as below to return logits:

import torch.nn as nn
from transformers.adapters import RobertaAdapterModel

class PLM(nn.Module):
    def __init__(self, lm_name='roberta-base'):
        super().__init__()
        self.model = RobertaAdapterModel.from_pretrained(lm_name)
        # Add a new adapter and a matching classification head, and activate them.
        self.model.add_adapter("classification", set_active=True)
        self.model.add_classification_head("classification", num_labels=2)
        # Freeze the pre-trained weights so that only the adapter (and head) are trained.
        self.model.train_adapter("classification")

    def forward(self, x):
        """Forward function of the model for classification."""
        logits = self.model(x)[0]  # logits tensor of shape (batch_size, num_labels)
        return logits
The rest of my code uses an existing custom training loop for PyTorch. However, I'm not getting any usable results from this (0% F1). The same loop with a vanilla fine-tuning approach reaches 50-60% F1. What could be the problem? Thank you for your response.
Hey @boscoj2008, the adapter setup and activation code you're showing doesn't seem to have any issues, so it's difficult to identify the problem from here. You could try the built-in AdapterTrainer class to see whether it makes any difference. Maybe you could also share some further details on the fine-tuning approach that works, so we can check whether there are any differences between that and the adapter setup.
As general hints: adapter training often needs more epochs to converge (e.g. more than 10, compared to around 3 for full fine-tuning) and usually works best with a slightly higher learning rate (e.g. 1e-4).
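To illustrate the above, here is a minimal sketch of what such an AdapterTrainer setup could look like; model refers to the RobertaAdapterModel configured in your snippet, and train_dataset / eval_dataset, the output directory, batch size, and epoch count are placeholders you would need to adapt:

from transformers import TrainingArguments
from transformers.adapters import AdapterTrainer

training_args = TrainingArguments(
    output_dir="./adapter_out",        # placeholder output directory
    learning_rate=1e-4,                # slightly higher LR, as suggested above
    num_train_epochs=15,               # adapters usually need more epochs than full fine-tuning
    per_device_train_batch_size=32,    # illustrative value
    evaluation_strategy="epoch",
)

trainer = AdapterTrainer(
    model=model,                   # the RobertaAdapterModel with the active "classification" adapter/head
    args=training_args,
    train_dataset=train_dataset,   # placeholder: tokenized dataset with input_ids/attention_mask/labels
    eval_dataset=eval_dataset,     # placeholder
)
trainer.train()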
Thanks @calpt for the insights. I have applied them and can at least see an improvement. May I ask about the invertible adapters used in MAD-X? Can invertible adapters be used without language adapters, i.e., in conjunction with task adapters?
Yes, in principle this is possible. You can configure invertible adapters like other adapter components in the adapter config (see here). Specifically, the two attributes inv_adapter and inv_adapter_reduction_factor control the invertible modules. E.g., it is possible to add a new adapter with the same invertible configuration used in MAD-X with:

from transformers.adapters import PfeifferConfig

model.add_adapter("adapter_name", config=PfeifferConfig(inv_adapter="nice", inv_adapter_reduction_factor=2))
This issue has been automatically marked as stale because it has been without activity for 90 days. This issue will be closed in 14 days unless you comment or remove the stale label.
This issue was closed because it was stale for 14 days without any activity.