ValueError: Target module CLIPEncoderLayer(
  (self_attn): CLIPAttention(
    (k_proj): Linear4bit(in_features=1024, out_features=1024, bias=True)
    (v_proj): Linear4bit(in_features=1024, out_features=1024, bias=True)
    (q_proj): Linear4bit(in_features=1024, out_features=1024, bias=True)
    (out_proj): Linear4bit(in_features=1024, out_features=1024, bias=True)
  )
  (layer_norm1): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
  (mlp): CLIPMLP(
    (activation_fn): QuickGELUActivation()
    (fc1): Linear4bit(in_features=1024, out_features=4096, bias=True)
    (fc2): Linear4bit(in_features=4096, out_features=1024, bias=True)
  )
  (layer_norm2): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
) is not supported. Currently, only the following modules are supported: `torch.nn.Linear`, `torch.nn.Embedding`, `torch.nn.Conv2d`, `transformers.pytorch_utils.Conv1D`.
I am not sure what the exact problem is since I do not know the details of your code, but from the error message it looks like LoRA is being applied to a container module such as CLIPEncoderLayer rather than to a leaf layer. PEFT can only inject adapters into certain module types, namely torch.nn.Linear, torch.nn.Embedding, torch.nn.Conv2d, and transformers.pytorch_utils.Conv1D.
In CLIP, the attention and MLP blocks wrap these linear layers, so instead of targeting the entire encoder layer, you may need to specify the inner projection layers directly, for example:
target_modules = ["q_proj", "k_proj", "v_proj", "out_proj", "fc1", "fc2"]
These are linear submodules inside CLIP that are compatible with LoRA and QLoRA. It might also be helpful if the error message or documentation made this distinction clearer.
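For reference, here is a minimal sketch of how those target modules could be passed to PEFT. The checkpoint name and LoRA hyperparameters are only illustrative, and the model is loaded in full precision here; your QLoRA setup would additionally use a 4-bit quantization config and prepare_model_for_kbit_training:

from transformers import CLIPModel
from peft import LoraConfig, get_peft_model

# Illustrative checkpoint; substitute your own model / 4-bit loading setup.
model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")

lora_config = LoraConfig(
    r=16,                 # placeholder rank
    lora_alpha=32,        # placeholder scaling
    lora_dropout=0.05,
    # Target the inner nn.Linear projections, not the CLIPEncoderLayer container.
    target_modules=["q_proj", "k_proj", "v_proj", "out_proj", "fc1", "fc2"],
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()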