yoyodyne
Shared embeddings
We currently store separate vocabularies for source, target, and features, and have separate embedding matrices for the encoder and the decoder. We propose to flatten this distinction in the following way:
- Keep a single vocabulary.
- If the user requests a non-shared embedding matrix, keep the source and target embeddings separate by enclosing the source symbols `{like}` `{this}` (cf. what we do with the features vocabulary, `[like]` `[this]`).
- When the vocabulary is created, continue to log the three vocabularies separately; this is useful debugging information. However, this can be moved into the vocabulary constructor itself.
- Store a single embedding matrix in the `Model` (not the `Module`); either delegate embedding lookup to the `Model` or pass a reference to the matrix when the modules are constructed. (There may be some tricks with initializing the parameters since some models are opinionated about this, but nothing that can't be solved here.)
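
As a rough illustration of the vocabulary half of this proposal, here is a minimal pure-Python sketch. Everything in it is hypothetical: the class name `UnifiedVocabulary`, the `tie_embeddings` flag, and the method names are invented for this example and are not existing yoyodyne API. Special symbols (padding, unknown, and so on) are left out for brevity.

```python
from typing import Dict, Iterable


class UnifiedVocabulary:
    """Hypothetical single vocabulary for source, target, and features."""

    def __init__(
        self,
        source: Iterable[str],
        target: Iterable[str],
        features: Iterable[str] = (),
        tie_embeddings: bool = True,
    ) -> None:
        # Assumption: this flag stands in for "the user requests a
        # shared embedding matrix".
        self.tie_embeddings = tie_embeddings
        self._index: Dict[str, int] = {}
        for symbol in source:
            self._add(self._source_key(symbol))
        for symbol in features:
            self._add(self._features_key(symbol))
        for symbol in target:
            self._add(symbol)

    def _add(self, key: str) -> None:
        # Assigns the next free index only if the key is new.
        self._index.setdefault(key, len(self._index))

    def _source_key(self, symbol: str) -> str:
        # When embeddings are not shared, source symbols are wrapped
        # {like} {this} so they never collide with target symbols.
        return symbol if self.tie_embeddings else f"{{{symbol}}}"

    @staticmethod
    def _features_key(symbol: str) -> str:
        # Features are always wrapped [like] [this].
        return f"[{symbol}]"

    def source_index(self, symbol: str) -> int:
        return self._index[self._source_key(symbol)]

    def features_index(self, symbol: str) -> int:
        return self._index[self._features_key(symbol)]

    def target_index(self, symbol: str) -> int:
        return self._index[symbol]

    def __len__(self) -> int:
        return len(self._index)


# The symbol "b" occurs on both the source and the target side.
shared = UnifiedVocabulary("ab", "bc", ["PL"], tie_embeddings=True)
separate = UnifiedVocabulary("ab", "bc", ["PL"], tie_embeddings=False)
```

With `tie_embeddings=True`, the source and target `b` map to the same index and therefore to the same row of the embedding matrix; with `tie_embeddings=False` they map to different rows. In both cases a single matrix with `len(vocabulary)` rows would be enough, which is what lets the `Model` own a single embedding matrix.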