                        How to set Lora_dropout=0 when loading trained peft model for inference?
System Info
peft==0.10.0 transformers==4.39.3
Who can help?
No response
Information
- [ ] The official example scripts
- [ ] My own modified scripts
Tasks
- [ ] An officially supported task in the examples folder
- [ ] My own task or dataset (give details below)
Reproduction
class Linear(nn.Module, LoraLayer):
    def forward(self, x: torch.Tensor, *args: Any, **kwargs: Any) -> torch.Tensor:
        self._check_forward_args(x, *args, **kwargs)
        adapter_names = kwargs.pop("adapter_names", None)
        if self.disable_adapters:
            if self.merged:
                self.unmerge()
            result = self.base_layer(x, *args, **kwargs)
        elif adapter_names is not None:
            result = self._mixed_batch_forward(x, *args, adapter_names=adapter_names, **kwargs)
        elif self.merged:
            result = self.base_layer(x, *args, **kwargs)
        else:
            result = self.base_layer(x, *args, **kwargs)
            torch_result_dtype = result.dtype
            for active_adapter in self.active_adapters:
                if active_adapter not in self.lora_A.keys():
                    continue
                lora_A = self.lora_A[active_adapter]
                lora_B = self.lora_B[active_adapter]
                dropout = self.lora_dropout[active_adapter]
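                # Note: self.lora_dropout[active_adapter] is an nn.Dropout module, or
                # nn.Identity when the adapter was configured with lora_dropout == 0;
                # nn.Dropout only drops values while the module is in training mode.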
                scaling = self.scaling[active_adapter]
                x = x.to(lora_A.weight.dtype)
                if not self.use_dora[active_adapter]:
                    result = result + lora_B(lora_A(dropout(x))) * scaling
                else:
                    x = dropout(x)
                    result = result + self._apply_dora(x, lora_A, lora_B, scaling, active_adapter)
            result = result.to(torch_result_dtype)
        return result
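For reference, a quick way to see what the lora_dropout entries actually are on a loaded adapter. This is only a sketch: AutoPeftModelForCausalLM is one way to load an adapter, and "path/to/adapter" is a placeholder for your own checkpoint directory.

from peft import AutoPeftModelForCausalLM

model = AutoPeftModelForCausalLM.from_pretrained("path/to/adapter")  # placeholder path

# Each LoRA layer keeps its dropout in a ModuleDict named lora_dropout;
# the per-adapter entry is nn.Dropout(p=...) or nn.Identity when p == 0.
for name, module in model.named_modules():
    if "lora_dropout." in name:  # per-adapter entries inside the lora_dropout ModuleDict
        print(name, module)
        break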
Expected behavior
We can see that lora_dropout in the forward function is applied in the same way whether the model is in training or inference mode.
Did you try it out? The nn.Dropout layer does not apply dropout unless it is in training mode. Moreover, when dropout is set to 0 at initialization, the lora_dropout entry is set to nn.Identity. Please check whether dropout is really applied in your case or whether it's a misunderstanding of the code.
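A minimal sketch of both points in plain PyTorch (independent of peft internals): nn.Dropout only drops values while the module is in training mode, and nn.Identity, which is what peft substitutes when lora_dropout is 0, never drops anything.

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(8)

drop.train()
print(drop(x))            # training mode: about half the entries zeroed, the rest scaled by 1 / (1 - p)

drop.eval()
print(drop(x))            # eval mode: dropout is a no-op, the input passes through unchanged

print(nn.Identity()(x))   # the lora_dropout == 0 case: always a no-op, in any mode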
Thank you! The key point is the training mode of the model. I trained the model without running evaluation, so after training the model was still in training mode, which led to inconsistent performance between that model and the one loaded from a checkpoint.
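For anyone hitting the same inconsistency, a minimal sketch of the fix, where model and inputs are placeholders for your trained PeftModel and a tokenized batch: call eval() before inference so dropout (and any other train-only behavior) is disabled, matching a model freshly loaded from a checkpoint.

model.eval()                    # switch dropout layers (including lora_dropout) to no-op
with torch.no_grad():
    outputs = model(**inputs)   # now consistent with the checkpoint-loaded model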