
Where is the factorized embedding parameterization implemented?

hntee opened this issue 4 years ago · 2 comments

Hi all, I read the paper and some of the code. The paper indicates that there is an intermediate embedding dimension E that factorizes the V -> H embedding lookup table into V -> E -> H matrices. However, the code at https://github.com/google-research/albert/blob/c21d8a3616a4b156d21e795698ad52743ccd8b73/modeling.py#L199-L206 seems to map the embedding directly from the input tensor.

So where is the intermediate V × E matrix? Am I missing something?
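(For context, the motivation the paper gives for the factorization is the parameter count: a full V -> H table costs V·H parameters, while the factorized V -> E -> H version costs V·E + E·H. With, say, V = 30,000, H = 768, E = 128, that is roughly 23.0M vs 3.9M embedding parameters.)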

hntee · May 06 '20 16:05

I have the same question and would like some confirmation on this, but from what I understand the shapes of self.word_embedding_output, self.output_embedding_table, and self.embedding_output are all with respect to the embedding size E. The code operates on batches, so instead of an explicit (V, E) matrix multiplication you are looking for the (batch_size, seq_length, E) tensor produced by the embedding lookup, which is self.embedding_output. This is later fed into the transformer model and projected to hidden_size, giving (batch_size, seq_length, H), here: https://github.com/google-research/albert/blob/c21d8a3616a4b156d21e795698ad52743ccd8b73/modeling.py#L1085-L1087. A sketch of the two steps follows below.
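To make the two steps concrete, here is a minimal NumPy sketch of factorized embedding parameterization. It is illustrative only, not the ALBERT code; the sizes and variable names are made up for the example:

```python
import numpy as np

# Minimal sketch of factorized embedding parameterization -- illustrative
# only, not the ALBERT code. V, E, H and all variable names are made up.
V, E, H = 30000, 128, 768                 # vocab size, embedding size, hidden size

embedding_table = np.random.randn(V, E)   # the (V, E) lookup table
projection = np.random.randn(E, H)        # the (E, H) projection matrix

token_ids = np.array([[5, 17, 42]])       # (batch_size, seq_length) input ids

# Step 1: embedding lookup -> (batch_size, seq_length, E)
word_embeddings = embedding_table[token_ids]

# Step 2: dense projection up to the hidden size -> (batch_size, seq_length, H)
hidden_input = word_embeddings @ projection

print(word_embeddings.shape)   # (1, 3, 128)
print(hidden_input.shape)      # (1, 3, 768)
```

In the repo, step 1 plays the role of the lookup at modeling.py#L199-L206 and step 2 the dense projection at #L1085-L1087, so the V -> E and E -> H factors are there; they are just applied in two separate places rather than as one standalone matrix product.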

asharma20 · May 08 '20 18:05

Thanks @asharma20!

hntee · May 13 '20 02:05