BERT-keras
number of trainable parameters
I don't quite understand one point. When I downloaded your Keras implementation of BERT and checked the number of trainable parameters in the model summary, it showed ~177 million parameters, while the official BERT base model should have 110 million. Could you explain where this difference comes from?
Hi, I'm not entirely sure, but maybe it's because of the subword embeddings? Most of the time people don't count input embeddings in their model parameters.
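For reference, the official ~110M figure for BERT-base can be reproduced with a back-of-the-envelope count, assuming the standard hyperparameters from the BERT paper (vocab 30522, 12 layers, hidden size 768, FFN size 3072). This is just arithmetic, not this repo's actual graph, so the extra ~67M in the Keras summary would have to come from components outside this count (e.g. a larger embedding table):

```python
# Rough parameter count for BERT-base (standard hyperparameters assumed).
V, P, S, H, L, F = 30522, 512, 2, 768, 12, 3072  # vocab, positions, segments, hidden, layers, FFN

# Token + position + segment embeddings, plus the embedding LayerNorm (gamma, beta).
embeddings = (V + P + S) * H + 2 * H

# Per encoder layer:
attention = 4 * (H * H + H) + 2 * H              # Q, K, V, output projections + LayerNorm
ffn = (H * F + F) + (F * H + H) + 2 * H          # two dense layers + LayerNorm
encoder = L * (attention + ffn)

pooler = H * H + H                               # dense layer on the [CLS] token

total = embeddings + encoder + pooler
print(f"{total:,}")  # ≈ 109.5M, i.e. the "110M" usually quoted for BERT-base
```

Note that the embedding table alone is ~23.8M of the total, which is why conventions about whether input embeddings are counted can shift the reported number substantially.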