[Training] Track the progress of BERT Training
This is an issue for tracking the progress of the BERT training example. The model comes in several sizes: tiny, small, base, and large. Only tiny and small fit on a typical single GPU on GCP; for base and large, we have to move training to TPU or use ParameterServerStrategy.
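For context, a minimal sketch of how such a strategy switch might look with `tf.distribute`; the TPU address, the `TF_CONFIG` cluster setup, and the helper name `make_strategy` are assumptions for illustration, not the exact script used here.

```python
import tensorflow as tf

def make_strategy(tpu_address=None, use_parameter_server=False):
    """Pick a distribution strategy for the larger BERT sizes (sketch)."""
    if tpu_address is not None:
        # Connect to a Cloud TPU and initialize it before building the model.
        resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu=tpu_address)
        tf.config.experimental_connect_to_cluster(resolver)
        tf.tpu.experimental.initialize_tpu_system(resolver)
        return tf.distribute.TPUStrategy(resolver)
    if use_parameter_server:
        # Requires a multi-worker cluster described by the TF_CONFIG env var.
        resolver = tf.distribute.cluster_resolver.TFConfigClusterResolver()
        return tf.distribute.experimental.ParameterServerStrategy(resolver)
    # Single-host fallback: mirror variables across local GPUs.
    return tf.distribute.MirroredStrategy()

strategy = make_strategy()
with strategy.scope():
    pass  # Build the model here so its variables are created under the strategy.
```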
Training config:
| MODEL NAME | NUM LAYERS (L) | HIDDEN SIZE (H) | NUM HEADS (A) | BATCH SIZE | NUM TRAIN STEPS |
|---|---|---|---|---|---|
| BERT SMALL | 4 | 512 | 8 | 256 | 50000 |
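For reference, a rough sketch of an encoder matching this row, built from `keras_nlp` layers. The vocabulary size (30522) and sequence length (128) are assumed from the common BERT defaults; the real training example also handles segment embeddings and the MLM/NSP heads.

```python
import keras_nlp
from tensorflow import keras

# Hyperparameters from the BERT SMALL row above.
NUM_LAYERS, HIDDEN_SIZE, NUM_HEADS = 4, 512, 8
# Assumed defaults, not taken from this issue.
VOCAB_SIZE, SEQ_LENGTH = 30522, 128

inputs = keras.Input(shape=(SEQ_LENGTH,), dtype="int32")
x = keras_nlp.layers.TokenAndPositionEmbedding(
    vocabulary_size=VOCAB_SIZE,
    sequence_length=SEQ_LENGTH,
    embedding_dim=HIDDEN_SIZE,
)(inputs)
for _ in range(NUM_LAYERS):
    x = keras_nlp.layers.TransformerEncoder(
        intermediate_dim=4 * HIDDEN_SIZE,  # standard 4x feed-forward width
        num_heads=NUM_HEADS,
    )(x)
encoder = keras.Model(inputs, x)
```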
Training Stats:
| loss | lm_loss | nsp_loss | lm_accuracy | nsp_accuracy |
|---|---|---|---|---|
| 3.2712 | 3.1801 | 0.0911 | 0.3717 | 0.9648 |
GLUE evaluation:
| MRPC | CoLA |
|---|---|
| 71.13 | 0 |
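For anyone reproducing these numbers, a hedged sketch of a GLUE/MRPC fine-tuning run. The vocabulary path, the checkpoint name, and the single-sentence preprocessing are placeholders; a faithful setup would pack `sentence1` and `sentence2` together with `[SEP]` markers and segment ids.

```python
import keras_nlp
import tensorflow_datasets as tfds
from tensorflow import keras

SEQ_LENGTH, BATCH_SIZE = 128, 32

tokenizer = keras_nlp.tokenizers.WordPieceTokenizer(
    vocabulary="vocab.txt",  # hypothetical path to the pretraining vocab
    sequence_length=SEQ_LENGTH,
)

def preprocess(example):
    # MRPC is a sentence-pair task; tokenizing sentence1 alone keeps the
    # sketch short but is not what the real evaluation does.
    return tokenizer(example["sentence1"]), example["label"]

mrpc = tfds.load("glue/mrpc")
train_ds = mrpc["train"].map(preprocess).batch(BATCH_SIZE)
val_ds = mrpc["validation"].map(preprocess).batch(BATCH_SIZE)

# Hypothetical path to the pretrained encoder produced by the run above.
encoder = keras.models.load_model("bert_small_pretrained")
inputs = keras.Input(shape=(SEQ_LENGTH,), dtype="int32")
features = encoder(inputs)
# Classify from the first token's representation, BERT-style.
outputs = keras.layers.Dense(2)(features[:, 0, :])
model = keras.Model(inputs, outputs)

model.compile(
    optimizer=keras.optimizers.Adam(2e-5),  # typical BERT fine-tuning LR
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
model.fit(train_ds, validation_data=val_ds, epochs=3)
```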
Training config:
| MODEL NAME | NUM LAYERS (L) | HIDDEN SIZE (H) | NUM HEADS (A) | BATCH SIZE | NUM TRAIN STEPS |
|---|---|---|---|---|---|
| BERT SMALL | 4 | 512 | 8 | 256 | 500000 |
GLUE evaluation:
| MRPC |
|---|
| 73.10 |
Platform: GCE + GPU
We fixed some weight-initialization issues and reran the experiment; the result is now closer to the officially reported score.
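For reference, the original BERT draws all weights from a truncated normal with stddev 0.02, while Keras layers default to `glorot_uniform`, so a reimplementation has to set the initializer explicitly. A sketch of doing so follows; whether this matches the exact fix made here is an assumption.

```python
import keras_nlp
from tensorflow import keras

# Align layer initialization with the original BERT reference.
encoder_layer = keras_nlp.layers.TransformerEncoder(
    intermediate_dim=2048,
    num_heads=8,
    kernel_initializer=keras.initializers.TruncatedNormal(stddev=0.02),
)
```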
Training config:
| MODEL NAME | NUM LAYERS (L) | HIDDEN SIZE (H) | NUM HEADS (A) | BATCH SIZE | NUM TRAIN STEPS |
|---|---|---|---|---|---|
| BERT SMALL | 4 | 512 | 8 | 256 | 500000 |
GLUE evaluation:
| MRPC |
|---|
| 75.01 |
Platform: Cloud TPU