LoRA
How is the 25.19M trainable-parameter count for GPT-2 Medium (FTTop2) computed?
In Table 3 of the paper "LoRA: Low-Rank Adaptation of Large Language Models", it says that fine-tuning the top 2 layers of GPT-2 Medium requires 25.19M trainable parameters. How is this number derived? I can't find it in the original paper that introduced this baseline, "Prefix-Tuning: Optimizing Continuous Prompts for Generation", and I'm confused about how to get it.
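For reference, here is one plausible breakdown that reproduces the 25.19M figure. It assumes FTTop2 means fine-tuning the last 2 transformer blocks of GPT-2 Medium (hidden size d = 1024), counting the attention and MLP weights plus their biases and the per-block LayerNorm parameters; the papers don't spell out the exact accounting, so treat this as a sketch:

```python
# Hypothetical reconstruction of the 25.19M FTTop2 count for GPT-2 Medium.
# Assumption: "top 2 layers" = the last 2 transformer blocks, d_model = 1024.

d = 1024  # GPT-2 Medium hidden size

# Self-attention: Q, K, V, and output projections, each d x d plus a bias of size d
attn = 4 * (d * d + d)

# MLP: two linear layers, d -> 4d and 4d -> d, each with a bias
mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)

# Two LayerNorms per block, each with a scale and a shift vector of size d
ln = 2 * 2 * d

per_block = attn + mlp + ln
total = 2 * per_block  # fine-tune only the top 2 blocks

print(per_block)  # 12596224 per block
print(total)      # 25192448, i.e. ~25.19M
```

Two such blocks give 25,192,448 parameters, which rounds to the 25.19M reported in the table.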