
Add support for soft prompts.

Open arivero opened this issue 1 year ago • 1 comment

Is your feature request related to a problem? Please describe.

An alternative to fine-tuning a whole model, or only some of its layers, is to fine-tune an ad-hoc prompt made of new tokens. Or, if position and other extra info are irrelevant, simply prepend an array of "already embedded" extra tokens.

This technique is described in the literature as "soft prompting". It gives very precise control over the number of trainable parameters, which is simply the hidden dimension times the number of extra tokens.
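For instance, with 20 prompt tokens and a hidden size of 768, that is only 15,360 trainable parameters. A minimal sketch of the prepending variant, assuming TensorFlow/Keras (the `SoftPrompt` layer name is hypothetical, not an existing keras-nlp API):

```python
import tensorflow as tf
from tensorflow import keras

class SoftPrompt(keras.layers.Layer):
    """Prepends a block of trainable vectors to an already-embedded sequence."""

    def __init__(self, num_prompt_tokens, hidden_dim, **kwargs):
        super().__init__(**kwargs)
        self.num_prompt_tokens = num_prompt_tokens
        self.hidden_dim = hidden_dim

    def build(self, input_shape):
        # All trainable parameters of the technique live here:
        # num_prompt_tokens * hidden_dim.
        self.prompt = self.add_weight(
            name="prompt",
            shape=(self.num_prompt_tokens, self.hidden_dim),
            initializer="random_normal",
            trainable=True,
        )

    def call(self, embedded_inputs):
        # embedded_inputs: (batch, seq_len, hidden_dim)
        batch_size = tf.shape(embedded_inputs)[0]
        prompt = tf.tile(self.prompt[tf.newaxis, ...], [batch_size, 1, 1])
        return tf.concat([prompt, embedded_inputs], axis=1)
```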

Now, training is subtle. In the first case one needs to train only part of the embedding weights while keeping the original ones untouched. In the second case, one needs to add a vector that must be concatenated with the input, which wreaks havoc on _keras_mask.
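One hedged way to keep the mask consistent after concatenation, assuming a boolean padding mask such as the one produced by `Embedding(mask_zero=True)`; in a custom layer this logic would go into `compute_mask` with `supports_masking = True`:

```python
import tensorflow as tf

def extend_keras_mask(mask, num_prompt_tokens):
    """Prepend `num_prompt_tokens` non-padding entries to a padding mask.

    mask: (batch, seq_len) boolean tensor, e.g. the _keras_mask computed
    before the soft prompt vectors were prepended to the sequence.
    """
    batch_size = tf.shape(mask)[0]
    # Prompt positions are never padding, so they get `True` entries.
    prompt_mask = tf.ones([batch_size, num_prompt_tokens], dtype=mask.dtype)
    return tf.concat([prompt_mask, mask], axis=1)
```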

Describe the solution you'd like

A standardized way to do this, as the procedure is general to almost every LLM. I am not sure whether it should take the first form (the whole model is a black box except for the embedding matrix) or the second (embedding matrix untouched, but the "soft prompts" concatenated deeper in the model, with the option, for transformers, of doing it before or after adding the position embedding).
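Purely as an illustration of what such a standardized entry point might look like (the `SoftPromptWrapper` below is hypothetical, not an existing keras-nlp API):

```python
import keras_nlp

# Freeze the pretrained backbone; only the prompt vectors would train.
backbone = keras_nlp.models.GPT2Backbone.from_preset("gpt2_base_en")
backbone.trainable = False

# Hypothetical wrapper covering both forms discussed above:
# soft_model = keras_nlp.layers.SoftPromptWrapper(
#     backbone,
#     num_prompt_tokens=20,
#     inject="after_position_embedding",  # or "before_position_embedding"
# )
```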

Describe alternatives you've considered

Both methods can be done by hand. Fine-tuning an extended embedding matrix can be done by combining a mask with a stop_gradient in the product. Additional parameters in the post-embedding space can be added at the cost of some memory wasted across the whole batch, plus rebuilding _keras_mask. I have not considered using the attention cache, which is probably a good third alternative.
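A sketch of the first by-hand alternative (mask plus stop_gradient, so gradients reach only the appended embedding rows); `PartiallyTrainableEmbedding` is just an illustrative name, assuming TensorFlow/Keras:

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

class PartiallyTrainableEmbedding(keras.layers.Layer):
    """Extended embedding matrix where only the new rows receive gradients."""

    def __init__(self, pretrained_matrix, num_new_tokens, **kwargs):
        super().__init__(**kwargs)
        vocab_size, hidden_dim = pretrained_matrix.shape
        new_rows = np.random.normal(scale=0.02, size=(num_new_tokens, hidden_dim))
        init = np.concatenate([pretrained_matrix, new_rows], axis=0)
        self.matrix = tf.Variable(
            init.astype("float32"), trainable=True, name="embeddings"
        )
        # 1.0 for the appended rows, 0.0 for the pretrained rows.
        row_mask = np.zeros((vocab_size + num_new_tokens, 1), dtype="float32")
        row_mask[vocab_size:] = 1.0
        self.row_mask = tf.constant(row_mask)

    def call(self, token_ids):
        # Pretrained rows pass through stop_gradient, so only new rows update.
        matrix = (self.row_mask * self.matrix
                  + (1.0 - self.row_mask) * tf.stop_gradient(self.matrix))
        return tf.gather(matrix, token_ids)
```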

arivero · Mar 20 '23 09:03