Training with attention or not

Open Naminwang opened this issue 8 months ago • 0 comments

Hi, when I look at the config on Hugging Face used for model prediction, attn_window_size is null, so I wonder whether attention is used during training. Also, could you share some training details, such as the learning rate and the size of the training data?

Naminwang, Jun 11 '24 03:06