About Pretraining on Diff Attention

Open • ghada-soliman opened this issue 10 months ago • 1 comment

Hello Team,

I would like to ask about your recommendation for the dataset used for pretraining the Diff Attention model.

Thank you.

ghada-soliman • Feb 23 '25 08:02

Hi, our training corpus follows that of StableLM-3B-4E1T: https://aka.ms/StableLM-3B-4E1T. You can also use any dataset you like to train Diff and compare it with a baseline Transformer; the results should be similar.

YTianZHU • Mar 03 '25 04:03