SoftVerbalizer not frozen in PromptForClassification pipeline

Open NtaylorOX opened this issue 2 years ago • 6 comments

According to the SoftVerbalizer script, and my general understanding of what is desired in a frozen-PLM training setting, the group_parameters_1 of the SoftVerbalizer should be frozen. However, in the current pipeline this does not happen.

I will do my best to showcase what I believe to be a bug, whereby the PLM is frozen but the SoftVerbalizer is not.

Below follows most of the OpenPrompt tutorial from the repo's README page.

from openprompt.plms import load_plm
# first load a PLM and its tokenizer (the README tutorial uses bert-base-cased)
plm, tokenizer, model_config, WrapperClass = load_plm("bert", "bert-base-cased")

from openprompt.data_utils import InputExample
classes = [ # There are two classes in Sentiment Analysis, one for negative and one for positive
    "negative",
    "positive"
]
dataset = [ # For simplicity, there's only two examples
    # text_a is the input text of the data, some other datasets may have multiple input sentences in one example.
    InputExample(
        guid = 0,
        text_a = "Albert Einstein was one of the greatest intellects of his time.",
    ),
    InputExample(
        guid = 1,
        text_a = "The film was badly made.",
    ),
]


from openprompt.prompts import ManualTemplate, SoftVerbalizer
promptTemplate = ManualTemplate(
    text = '{"placeholder":"text_a"} It was {"mask"}',
    tokenizer = tokenizer,
)

# setup the soft verbalizer
promptVerbalizer = SoftVerbalizer(tokenizer, plm, num_classes=2)


# instantiate the PromptForClassification model

from openprompt import PromptForClassification

# model with the PLM frozen
promptModel = PromptForClassification(
    template = promptTemplate,
    plm = plm,
    verbalizer = promptVerbalizer,
    freeze_plm = True
)

After instantiating the promptModel, the PLM will have its parameters' requires_grad set to False. Now we can look at the number of tunable parameters, i.e. those in the whole prompt model with requires_grad = True.
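
As a quick sanity check that the freeze actually reached the PLM (plain PyTorch, nothing OpenPrompt-specific):

# every PLM parameter should now be frozen
assert all(not p.requires_grad for p in promptModel.plm.parameters())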

# check the number of params with requires_grad = True

def get_n_trainable_params(model):
    # all trainable
    num_total_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    
    # split into the PLM, template, and verbalizer (classification head) parts
    num_plm_trainable = sum(p.numel() for p in model.plm.parameters() if p.requires_grad)
    
    # template trainable
    try:
        num_template_trainable = sum(p.numel() for p in model.template.soft_embedding.parameters() if p.requires_grad)
    except AttributeError:  # e.g. a ManualTemplate has no soft_embedding
        num_template_trainable = 0
    
    # verbalizer trainable 
    num_verbalizer_trainable = sum(p.numel() for p in model.verbalizer.parameters() if p.requires_grad)
    
    # assert that the three parts sum to the total
    assert num_plm_trainable + num_template_trainable + num_verbalizer_trainable == num_total_trainable
    
    print(f"Number of trainable parameters of PLM: {num_plm_trainable}\n")
    print('#'*50)
    print(f"Number of trainable parameters of template: {num_template_trainable}\n")
    print('#'*50)
    print(f"Number of trainable parameters of verbalizer: {num_verbalizer_trainable}\n")
    print('#'*50)
    print(f"Total number of trainable parameters of whole model: {num_total_trainable}")
    print(f"Verbalizer grouped_parameters_1 require_grad: {model.verbalizer.group_parameters_1[0].requires_grad}")



get_n_trainable_params(promptModel)

Number of trainable parameters of PLM: 0

##################################################
Number of trainable parameters of template: 0

##################################################
Number of trainable parameters of verbalizer: 622660

##################################################
Total number of trainable parameters of whole model: 622660

Verbalizer group_parameters_1 requires_grad: True

But the verbalizer in this case has all parameters with requires_grad = True, including group_parameters_1.
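
A manual workaround seems to be flipping the flags yourself after building the model (just a sketch on my end, not an official fix):

# hypothetical workaround: freeze the copied-head parameters by hand
for p in promptModel.verbalizer.group_parameters_1:
    p.requires_grad = False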

Alternatively, if you re-initialize the SoftVerbalizer and promptModel with the now-frozen PLM, it behaves as it should.


promptVerbalizer_2 = SoftVerbalizer(tokenizer, plm, num_classes=2)
promptModel_2 = PromptForClassification(
    template = promptTemplate,
    plm = plm,
    verbalizer = promptVerbalizer_2,
    freeze_plm = True
)

get_n_trainable_params(promptModel_2)

Number of trainable parameters of PLM: 0

##################################################
Number of trainable parameters of template: 0

##################################################
Number of trainable parameters of verbalizer: 1536

##################################################
Total number of trainable parameters of whole model: 1536

Verbalizer group_parameters_1 requires_grad: False

This is a huge difference in the number of trainable parameters, and I wonder if the latter scenario is actually the desired one in the frozen-PLM setting?

See: https://github.com/thunlp/OpenPrompt/blob/4ba7cb380e7b42c19d566e9836dce7efdb2cc235/openprompt/prompts/soft_verbalizer.py#L82
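
My reading of that line is that SoftVerbalizer deep-copies the PLM head at construction time, so the copy inherits whatever requires_grad flags the head carries at that moment, while freeze_plm in PromptForClassification only touches model.plm afterwards. If that reading is right, the ordering dependence reproduces in plain PyTorch:

from copy import deepcopy
import torch.nn as nn

head = nn.Linear(768, 768)
copied = deepcopy(head)             # the copy inherits requires_grad=True
for p in head.parameters():         # freezing the original afterwards...
    p.requires_grad = False
print(copied.weight.requires_grad)  # ...leaves the copy trainable: True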

NtaylorOX avatar Apr 05 '22 14:04 NtaylorOX

In general, SoftVerbalizer does not require a frozen-model setting (SoftPrompt could be used in a frozen-model scenario), so I cannot entirely follow your question. If you have further questions, feel free to email us, thanks.

ningding97 avatar Jun 06 '22 10:06 ningding97

It seems my point may have been missed here. This wasn't a question; I am still quite sure this is a bug. I have shown that the PLM is not being frozen properly when using the soft verbalizer, which means some parameters of the PLM / soft verbalizer will be updated during training, unless I've misunderstood something.

NtaylorOX avatar Jun 06 '22 11:06 NtaylorOX

Thanks for the reply! Just want to add that this confused us because, according to the original WARP paper, the soft verbalizer should be a weight matrix, so the number of parameters (regardless of whether they can be tuned) is embedding_size × num_of_class, i.e. 768 × 2 = 1536, which makes sense. So what is confusing is why in the first case the number of tunable parameters is that large. And also maybe there's a typo in the first block where 'freeze_plm' should be False.
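
For reference, that count is just a bias-free linear layer mapping the hidden size to the number of classes (a minimal sketch assuming hidden size 768):

import torch.nn as nn

# WARP-style verbalizer head: one weight vector per class
warp_head = nn.Linear(768, 2, bias=False)
print(sum(p.numel() for p in warp_head.parameters()))  # 1536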

YiZhang025 avatar Jun 06 '22 11:06 YiZhang025

Hi,

Thank you for confirming! And yes this is essentially what I was trying to highlight with this issue.

My understanding is that setting freeze_plm to True should freeze all PLM params and leave the verbalizer's parameters untouched. So in my first example this is not a typo.

This is why I referenced https://github.com/thunlp/OpenPrompt/blob/4ba7cb380e7b42c19d566e9836dce7efdb2cc235/openprompt/prompts/soft_verbalizer.py#L82 because it seems that part of the verbalizer is not being properly frozen.

If you inspect the verbalizer, it seems to contain parts of the PLM head followed by the WARP structure of a weight vector for each class. It has been a while since I posted this, but I am still quite sure that, based on the example provided, the latter promptModel_2 is the desired outcome of creating a prompt model with a frozen PLM + soft verbalizer.
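
For anyone who wants to reproduce that inspection (group_parameters_2 is, as far as I can tell from the same script, the per-class WARP projection):

# the copied PLM head layers plus the per-class projection
print(promptModel_2.verbalizer)
print(sum(p.numel() for p in promptModel_2.verbalizer.group_parameters_1))  # copied PLM head
print(sum(p.numel() for p in promptModel_2.verbalizer.group_parameters_2))  # 1536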

Hope I have been clear in my comments :)

NtaylorOX avatar Jun 06 '22 12:06 NtaylorOX

Oh, yes! It is a bug, thanks a lot, we will fix it soon!

ningding97 avatar Jun 06 '22 15:06 ningding97

@ningding97 When will this be fixed? I need to use it

MrigankRaman avatar Jun 27 '22 07:06 MrigankRaman