
Issues found in code review - series #1

henryj18 opened this issue on Sep 02, 2023 · 1 comment

This is one of a series of issues found during code review on the branch "add_text_modality" (some issues are inherited from the master branch). We will keep track of multiple issues found in code review and fix them in one batch for efficiency.

Issue 1: Between https://github.com/ManifoldRG/gato-control/blob/master/gato/policy/gato_policy.py#L81 and https://github.com/ManifoldRG/gato-control/blob/master/gato/policy/gato_policy.py#L113 (inside the definition of class GatoPolicy(); we reference the line range this way because the line numbers may shift as development goes on and more code is added or revised):

"config" is instantiated as an instance of PretrainedConfig or its subclass GPT2Config, neither of which defines the attributes flash and gate. We need to remove these two attributes, and in particular the assignment statements with "config.flash" and "config.gate" on the left-hand side, such as "config.flash = flash" and "config.gate = False". These attributes and their assignments have no effect and are ignored when the "config" is passed into GPT2Model. Leaving these lines in the code is misleading, since they suggest that flash attention and gating layers are used when they actually are not.
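
To make the intended cleanup concrete, here is a minimal sketch. The attribute names come from the issue text; the exact way the config is constructed in gato_policy.py may differ:

```python
from transformers import GPT2Config, GPT2Model

# GPT2Config (like any PretrainedConfig) accepts arbitrary attribute
# assignments, but GPT2Model never reads `flash` or `gate`, so lines like
# the following are dead code and should simply be deleted:
#
#   config.flash = flash   # remove: has no effect on GPT2Model
#   config.gate = False    # remove: has no effect on GPT2Model
#
config = GPT2Config()          # construct the config without the unused attributes
transformer = GPT2Model(config)
```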

Issue 2: We should add an option to allow text tokens to be initialized with the GPT-2 pretrained embedding. We need to decide whether this option is turned on automatically when the pretrained_lm option is enabled, or whether a separate option should be added; a separate option seems preferable. If this option is turned on, we should add some code after self.embed_token = nn.Embedding(self.vocab_size, embed_dim) inside "__init__" of GatoPolicy() to access the portion of the embedding weights associated with text tokens (the first 50257 tokens of the complete vocab, i.e. the size of the GPT-2 text vocabulary) and initialize it with the pretrained embedding, as sketched below.
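
A minimal sketch of what that initialization could look like, assuming a hypothetical flag name init_text_embeddings_from_gpt2 (the actual option name is still to be decided per this issue) and assuming embed_dim matches GPT-2's hidden size (768 for the base "gpt2" checkpoint):

```python
import torch
import torch.nn as nn
from transformers import GPT2Model

GPT2_VOCAB_SIZE = 50257  # number of GPT-2 text tokens at the start of the vocab

# ... inside GatoPolicy.__init__, right after the embedding is created:
self.embed_token = nn.Embedding(self.vocab_size, embed_dim)

if init_text_embeddings_from_gpt2:  # hypothetical flag, name to be decided
    pretrained = GPT2Model.from_pretrained("gpt2")
    # pretrained.wte is GPT-2's token embedding table, shape (50257, 768).
    # Copy it into the first 50257 rows of our embedding; this only works as-is
    # if embed_dim equals GPT-2's hidden size, otherwise a projection is needed.
    with torch.no_grad():
        self.embed_token.weight[:GPT2_VOCAB_SIZE].copy_(pretrained.wte.weight)
```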

henryj18 · Sep 02 '23 11:09