unilm
Information about implementation of BeiTv2
Hi, I would like to know whether layer scale (initialized to 0.1) is used when fine-tuning BeiTv2 on the classification task. In the code, layer_scale_init_value appears to be set to 0.1, but I could not find any mention of it in Table 7 ('Hyperparameters for Image Classification Fine-tuning') of the paper. Thanks in advance.
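To be clear about what I mean by layer scale, here is a minimal sketch. It assumes the usual formulation, in which the output of each residual branch is multiplied by a per-channel vector initialized to a constant before being added back to the input. The function and variable names are illustrative only and are not taken from the unilm code; only the value 0.1 comes from the setting I am asking about.

```python
# Hypothetical illustration of layer scale; not code from the unilm repository.

def layer_scale_residual(x, branch_out, gamma):
    """Return x + gamma * branch_out, computed channel by channel."""
    return [xi + g * bi for xi, g, bi in zip(x, gamma, branch_out)]


dim = 4
init_value = 0.1            # the value in question (layer_scale_init_value)
gamma = [init_value] * dim  # learnable in a real model; held constant here

x = [1.0, 2.0, 3.0, 4.0]               # block input
branch_out = [10.0, 10.0, 10.0, 10.0]  # stand-in for an attention or MLP output

# With gamma = 0.1, only a tenth of the branch output reaches the residual sum.
y = layer_scale_residual(x, branch_out, gamma)
print(y)
```

If layer scale were disabled, the block would reduce to the plain residual sum x + branch_out, which is why I would like to confirm which variant was used for the reported fine-tuning results.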