OpenPrompt
SoftVerbalizer not frozen in PromptForClassification pipeline
According to the SoftVerbalizer script and my general understanding of what is desired in a frozen-PLM training setting, the group_parameters_1 of the SoftVerbalizer should be frozen. However, in the current pipeline this does not happen.
I will do my best to showcase what I believe to be a bug, whereby the PLM is frozen but the SoftVerbalizer is not.
Below follows most of the OpenPrompt tutorial from the repo's README page.
from openprompt.data_utils import InputExample
from openprompt.plms import load_plm

# load a PLM and its tokenizer (bert-base-cased assumed here, as in the README tutorial)
plm, tokenizer, model_config, WrapperClass = load_plm("bert", "bert-base-cased")
classes = [ # There are two classes in Sentiment Analysis, one for negative and one for positive
"negative",
"positive"
]
dataset = [ # For simplicity, there's only two examples
# text_a is the input text of the data, some other datasets may have multiple input sentences in one example.
InputExample(
guid = 0,
text_a = "Albert Einstein was one of the greatest intellects of his time.",
),
InputExample(
guid = 1,
text_a = "The film was badly made.",
),
]
from openprompt.prompts import ManualTemplate, SoftVerbalizer
promptTemplate = ManualTemplate(
text = '{"placeholder":"text_a"} It was {"mask"}',
tokenizer = tokenizer,
)
# setup the soft verbalizer
promptVerbalizer = SoftVerbalizer(tokenizer, plm, num_classes=2)
# instantiate the PromptForClassification model
from openprompt import PromptForClassification
# model with the PLM frozen (freeze_plm = True)
promptModel = PromptForClassification(
template = promptTemplate,
plm = plm,
verbalizer = promptVerbalizer,
freeze_plm = True
)
After instantiating the promptModel, the PLM will have its parameters' requires_grad set to False. Now we can look at the number of tunable parameters, i.e. those in the whole prompt model that have requires_grad = True.
# check number of params that require_grad
def get_n_trainable_params(model):
# all trainable
num_total_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
# split into the plm, template, and verbalizer (classification head)
num_plm_trainable = sum(p.numel() for p in model.plm.parameters() if p.requires_grad)
# template trainable
try:
num_template_trainable = sum(p.numel() for p in model.template.soft_embedding.parameters() if p.requires_grad)
except AttributeError:  # template has no soft_embedding (e.g. ManualTemplate)
num_template_trainable = 0
# verbalizer trainable
num_verbalizer_trainable = sum(p.numel() for p in model.verbalizer.parameters() if p.requires_grad)
# assert that the parts sum to the total
assert num_plm_trainable+num_template_trainable+num_verbalizer_trainable == num_total_trainable
print(f"Number of trainable parameters of PLM: {num_plm_trainable}\n")
print('#'*50)
print(f"Number of trainable parameters of template: {num_template_trainable}\n")
print('#'*50)
print(f"Number of trainable parameters of verbalizer: {num_verbalizer_trainable}\n")
print('#'*50)
print(f"Total number of trainable parameters of whole model: {num_total_trainable}")
print(f"Verbalizer grouped_parameters_1 require_grad: {model.verbalizer.group_parameters_1[0].requires_grad}")
get_n_trainable_params(promptModel)
Number of trainable parameters of PLM: 0
##################################################
Number of trainable parameters of template: 0
##################################################
Number of trainable parameters of verbalizer: 622660
##################################################
Total number of trainable parameters of whole model: 622660
Verbalizer grouped_parameters_1 require_grad: True
But the verbalizer in this case has all of its parameters with requires_grad = True, including group_parameters_1.
If you then re-initialize the SoftVerbalizer and promptModel with the now-frozen PLM, things look as they should.
promptVerbalizer_2 = SoftVerbalizer(tokenizer, plm, num_classes=2)
promptModel_2 = PromptForClassification(
template = promptTemplate,
plm = plm,
verbalizer = promptVerbalizer_2,
freeze_plm = True
)
get_n_trainable_params(promptModel_2)
Number of trainable parameters of PLM: 0
##################################################
Number of trainable parameters of template: 0
##################################################
Number of trainable parameters of verbalizer: 1536
##################################################
Total number of trainable parameters of whole model: 1536
Verbalizer grouped_parameters_1 require_grad: False
This is a huge difference in the number of trainable parameters, and I wonder whether the latter scenario is actually the desired one in the frozen-PLM setting.
See: https://github.com/thunlp/OpenPrompt/blob/4ba7cb380e7b42c19d566e9836dce7efdb2cc235/openprompt/prompts/soft_verbalizer.py#L82
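For now, a possible workaround (a minimal sketch, assuming group_parameters_1 is the PLM-derived part of the verbalizer head, as in soft_verbalizer.py) is to freeze that group explicitly after building the prompt model:
# Possible workaround (sketch, not the official fix): explicitly freeze the
# PLM-derived part of the SoftVerbalizer head after the prompt model is built.
for p in promptModel.verbalizer.group_parameters_1:
    p.requires_grad = False

# re-check: the verbalizer should now report only the small WARP-style head as trainable
get_n_trainable_params(promptModel)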
In general, SoftVerbalizer does not require a frozen-model setting (SoftPrompt can be used in a frozen-model scenario), so I cannot entirely follow your question. If you have further questions, feel free to email us. Thanks.
It seems my point may have been missed here. This wasn't a question; I am still quite sure this is a bug. I have shown that the PLM is not being frozen properly when using the soft verbalizer, which means some parameters of the PLM / soft verbalizer will be updated during training, unless I've misunderstood something.
Thanks for the reply! Just want to add that this surprised us because, according to the original WARP paper, the soft verbalizer should be a weight matrix, so the number of parameters (regardless of whether it can be tuned) is embedding_size * num_of_class; 768 * 2 = 1536 makes sense. So what is confusing is why, in the first case, the number of tunable parameters is that large. Also, maybe there's a typo in the first block where 'freeze_plm' should be False.
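For reference, a quick check of that count (a sketch, using a bias-free linear layer as a stand-in for the WARP-style head, and assuming bert-base with hidden size 768):
import torch.nn as nn

# Stand-in for the WARP-style verbalizer head: one bias-free projection
# from the hidden size to the number of classes.
head = nn.Linear(768, 2, bias=False)
print(sum(p.numel() for p in head.parameters()))  # 768 * 2 = 1536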
Hi,
Thank you for confirming! And yes this is essentially what I was trying to highlight with this issue.
My understanding is that setting freeze_plm to True should freeze all PLM params and leave the verbalizer's parameters untouched, so in my first example this is not a typo.
This is why I referenced https://github.com/thunlp/OpenPrompt/blob/4ba7cb380e7b42c19d566e9836dce7efdb2cc235/openprompt/prompts/soft_verbalizer.py#L82 because it seems that part of the verbalizer is not being properly frozen.
If you inspect the verbalizer, it seems to contain parts of the PLM followed by the WARP-style structure of a weight matrix for each class. It has been a while since I posted this, but I am still quite sure that, based on the example provided, the latter promptModel_2 is the desired outcome of creating a prompt model with a frozen PLM + soft verbalizer.
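To make that concrete, this is roughly how I check the two parameter groups (a sketch; group_parameters_1 and group_parameters_2 are the attribute names taken from soft_verbalizer.py):
# Sketch: compare the two SoftVerbalizer parameter groups in both models.
def describe_verbalizer(verbalizer):
    g1 = list(verbalizer.group_parameters_1)  # layers copied from the PLM head
    g2 = list(verbalizer.group_parameters_2)  # new per-class weight matrix (WARP-style head)
    print(f"group_parameters_1: {sum(p.numel() for p in g1)} params, "
          f"trainable={any(p.requires_grad for p in g1)}")
    print(f"group_parameters_2: {sum(p.numel() for p in g2)} params, "
          f"trainable={any(p.requires_grad for p in g2)}")

describe_verbalizer(promptModel.verbalizer)    # group_parameters_1 unexpectedly trainable
describe_verbalizer(promptModel_2.verbalizer)  # group_parameters_1 frozen, as expected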
Hope I have been clear in my comments :)
Oh, yes! It is a bug, thanks a lot, we will fix it soon!
@ningding97 When will this be fixed? I need to use it.