
The detail about manual prompts

Open sinan106 opened this issue 2 years ago • 2 comments

Hi,

Thank you for your work and for releasing the code! After reading your paper, I am confused about the manual prompts.

""" In addition to the regular models where we initialize with [MASK] tokens, we performed a run on the GLUE datasets with the same prompt [CLS] "S1"? [MASK]. "S2"! [SEP] for all the tasks """

For the manual prompts, I would like to know where the prompts are inserted. Wouldn't the original WARP also have the [CLS], [SEP] and [MASK] special tokens? What is the difference between WARPinit and WARP8 in where the prompts are inserted?

I don't know much about this field, so thank you very much for answering my question.

sinan106, Sep 19 '22 13:09

Dear @sinan106,

There are two separate things to distinguish:

  1. Prompt Template: the pre-defined positions (a hyperparameter) at which the trainable WARP parameters are placed. An example template is [CLS] [P1] [P2] [S1] [P3] [MASK] [P4] [S2] [P5] [P6] [SEP], which means that the positions [P1]...[P6] hold trainable word embeddings.
  2. Prompt Parameters / Embeddings: the trainable parameters (embeddings) for [P1]...[P6], respectively.
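To make the template idea concrete, here is a minimal sketch in plain Python. It is not the WARP repository's code: the function name `fill_template` and the pre-tokenized toy sentences are assumptions made for illustration. It expands the example template into a token sequence and records which positions would hold trainable prompt embeddings.

```python
import re

# Example template from the reply above: [P1]...[P6] are trainable slots,
# [S1] and [S2] are placeholders for the two input sentences.
TEMPLATE = "[CLS] [P1] [P2] [S1] [P3] [MASK] [P4] [S2] [P5] [P6] [SEP]"


def fill_template(template, s1_tokens, s2_tokens):
    """Expand a template into a token list and collect prompt-slot positions.

    Hypothetical helper: returns (tokens, prompt_positions), where
    prompt_positions are the indices whose embeddings would be trainable.
    """
    tokens, prompt_positions = [], []
    for slot in template.split():
        if slot == "[S1]":
            tokens.extend(s1_tokens)
        elif slot == "[S2]":
            tokens.extend(s2_tokens)
        else:
            # A slot named [P<number>] marks a trainable prompt position.
            if re.fullmatch(r"\[P\d+\]", slot):
                prompt_positions.append(len(tokens))
            tokens.append(slot)
    return tokens, prompt_positions


# Toy, already-tokenized sentences (illustrative only).
tokens, prompt_positions = fill_template(TEMPLATE, ["a", "cat"], ["an", "animal"])
```

In this sketch only the embeddings at `prompt_positions` would be updated during training; the special tokens and the sentence tokens keep their usual embeddings.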

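The only difference between WARP8 and WARPinit is the initialization: the original WARP8 parameters are initialized randomly (or with a fixed vector), whereas the WARPinit parameters are initialized with the actual word embeddings of ", ", ?, ., ", ", ! (the quotation marks and punctuation of the manual prompt quoted below).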

Thus, before any training steps, WARPinit is already as good as a zero-shot classifier.
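The two initialization schemes can be sketched as follows. This is a toy illustration in plain Python, not the actual implementation: the embedding table, its made-up values, the dimension, the standard deviation, and the helper names `init_random` and `init_from_tokens` are all assumptions. The token list is the one given in the reply above.

```python
import random

EMBED_DIM = 4  # toy dimension, chosen only for illustration
rng = random.Random(0)

# Toy stand-in for a pretrained word-embedding table (values are made up).
VOCAB = ['"', "?", ".", "!", "[MASK]"]
word_embeddings = {
    tok: [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)] for tok in VOCAB
}


def init_random(num_prompts, dim, rng):
    """Random initialization: each prompt vector is drawn independently."""
    return [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(num_prompts)]


def init_from_tokens(init_tokens, table):
    """Manual-prompt initialization: copy the embedding of each listed token."""
    # Copy each vector so later updates would not modify the table itself.
    return [list(table[tok]) for tok in init_tokens]


# Tokens named in the reply above, in the order they were listed.
MANUAL_PROMPT_TOKENS = ['"', '"', "?", ".", '"', '"', "!"]

warp_init_params = init_from_tokens(MANUAL_PROMPT_TOKENS, word_embeddings)
warp_random_params = init_random(len(MANUAL_PROMPT_TOKENS), EMBED_DIM, rng)
```

Both variants produce the same number of trainable vectors; they differ only in their starting values, which is why the token-initialized variant behaves like the manual prompt before any training.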

""" In addition to the regular models where we initialize with [MASK] tokens, we performed a run on the GLUE datasets with the same prompt [CLS] "S1"? [MASK]. "S2"! [SEP] for all the tasks """

(This paragraph is only about WARPinit.)

I hope this was helpful!

mahnerak, Sep 19 '22 23:09

Thanks a lot for your answer!!

In conclusion

  1. $WARP_1$ ~ $WARP_{20}$ and $WARP_{init}$ all follow the same Prompt Template.
  2. The difference is in the initialization of the Prompt Embeddings.
  3. The prompt length of $WARP_{init}$ is also 8.

I hope my understanding is correct, thanks again for your answer!

sinan106, Sep 20 '22 00:09