Details about the manual prompts
Hi,
Thank you for your work and for releasing the code! After reading your paper, I am confused about the manual prompts.
""" In addition to the regular models where we initialize with [MASK] tokens, we performed a run on the GLUE datasets with the same prompt [CLS] "S1"? [MASK]. "S2"! [SEP] for all the tasks """ In the manual prompts, I want to know where to insert the prompts. Wouldn't the original WARP also have [CLS], [SEP] and [MASK] special tokens? What is the difference between WARPinit and WARP8 in the insertion position of prompts?
I don't know much about this field yet, so thank you very much for answering my question.
Dear @sinan106,
- Prompt Template - the pre-defined (as a hyperparameter) positions where the trainable WARP parameters are placed. An example of a template is:
  [CLS] [P1] [P2] [S1] [P3] [MASK] [P4] [S2] [P5] [P6] [SEP]
  which means that the [P1]...[P6] positions have trainable word embeddings.
- Prompt Parameters / Embeddings - the trainable parameters / embeddings for [P1]...[P6], respectively.
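In case a concrete illustration helps, here is a minimal sketch of the idea. This is not the released WARP code; the model name, helper function, and initialization scale are illustrative assumptions, using Hugging Face transformers and PyTorch:

```python
# Minimal sketch (not the released WARP code): trainable prompt embeddings
# spliced into a frozen masked LM at the template positions
#   [CLS] [P1] [P2] [S1] [P3] [MASK] [P4] [S2] [P5] [P6] [SEP]
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "roberta-large"                      # illustrative choice
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
for p in model.parameters():                      # the backbone stays frozen
    p.requires_grad = False

hidden = model.config.hidden_size
prompt_embeds = nn.Parameter(torch.randn(6, hidden) * 0.02)  # [P1]...[P6], trainable

def build_inputs(s1: str, s2: str) -> torch.Tensor:
    """Assemble input embeddings following the template above (hypothetical helper)."""
    emb = model.get_input_embeddings()
    def tok(text):
        ids = tokenizer(text, add_special_tokens=False, return_tensors="pt").input_ids[0]
        return emb(ids)
    cls_e = emb(torch.tensor([tokenizer.cls_token_id]))
    sep_e = emb(torch.tensor([tokenizer.sep_token_id]))
    mask_e = emb(torch.tensor([tokenizer.mask_token_id]))
    P = prompt_embeds                              # P[0:1] is [P1], etc.
    seq = torch.cat([cls_e, P[0:1], P[1:2], tok(s1), P[2:3], mask_e,
                     P[3:4], tok(s2), P[4:5], P[5:6], sep_e], dim=0)
    return seq.unsqueeze(0)                        # shape (1, seq_len, hidden)

# Only prompt_embeds (and the small verbalizer head, omitted here) receive gradients.
logits = model(inputs_embeds=build_inputs("A sentence.", "Another sentence.")).logits
```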
The only difference between WARP8 and WARPinit is that the original WARP8 parameters are initialized randomly (or with a fixed vector), while the WARPinit parameters are initialized with the actual word embeddings of `"`, `"`, `?`, `.`, `"`, `"`, `!`.
Thus, WARPinit is as good as a zero-shot classifier even before any training steps.
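If it helps to see the two initializations side by side, here is another small sketch (again with illustrative names and a hypothetical setup, not the actual implementation):

```python
# Sketch of the initialization difference (illustrative, not the released code):
# WARP8 starts from random (or fixed) vectors, while WARPinit copies the word
# embeddings of the manual-prompt tokens from [CLS] "S1"? [MASK]. "S2"! [SEP].
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "roberta-large"                       # illustrative choice
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
word_emb = model.get_input_embeddings().weight     # frozen vocabulary embeddings
hidden = word_emb.shape[1]

manual_tokens = ['"', '"', '?', '.', '"', '"', '!']  # the tokens listed above
n_prompts = len(manual_tokens)

# WARP8-style initialization: random (or a fixed vector)
random_init = torch.randn(n_prompts, hidden) * 0.02

# WARPinit-style initialization: copy the embeddings of the manual-prompt tokens,
# so the model behaves like a zero-shot (manual-prompt) classifier before training.
ids = torch.tensor([tokenizer(t, add_special_tokens=False).input_ids[0]
                    for t in manual_tokens])
word_init = word_emb[ids].detach().clone()
```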
""" In addition to the regular models where we initialize with [MASK] tokens, we performed a run on the GLUE datasets with the same prompt [CLS] "S1"? [MASK]. "S2"! [SEP] for all the tasks """
(This paragraph is only about WARPinit.)
I hope this was helpful!
Thanks a lot for your answer!!
In conclusion:
- $WARP_1$ ~ $WARP_{20}$ and $WARP_{init}$ all follow the same Prompt Template.
- The difference is the initialization of the Prompt Embeddings.
- The prompt length of $WARP_{init}$ is also 8.
I hope my understanding is correct, thanks again for your answer!