SPM_toolkit
Clarifications regarding ESIM Model code
Hi
I have a few questions related to the ESIM model for NLI:
- The ESIM paper says all vectors, including the word vectors, are updated during training. Are the pretrained GloVe word embeddings also updated in your ESIM model code?
- The file at ESIM/main_batch_snli.py uses only the ESIM model, while the file at ESIM/Tree_IM/main_snli.py uses both the ESIM and Tree LSTM models, right?
- The num_units in the ESIM LSTM is always equal to the pretrained embedding dimension. Is that required? I got dimension errors when I tried to change num_units in the LSTM.
- There is a default parameter max_sentence_length = 30, but it isn't used anywhere in the model_batch.py file. Does it have any effect? I thought max_len was the parameter that controls sequence length and can be modified in the main_snli.py file.
It would be great if you could clear these doubts. Thanks!
Hello, here is my answer:
- Yes, they are updated during training.
- Tree_IM just replaces the original LSTM modules with Tree LSTM, so I reuse much of the code from the original ESIM.
- The LSTM has input_size (the number of expected features in the input x) and hidden_size (the number of features in the hidden state h). Only the first one, input_size, should be equal to the word embedding dimension.
- max_sentence_length = 30 is used during model training: sentences longer than 30 are skipped.
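
To illustrate the last point, here is a minimal sketch of length-based skipping. It is not the repository's code: the names `keep_pair` and `filter_pairs` are hypothetical, and checking both sentences of a pair is an assumption. The only detail taken from the thread is that sentences longer than 30 are skipped, so a sentence of exactly 30 is kept.

```python
# Hypothetical sketch of skipping over-long sentences before training.
# The threshold comes from the thread; everything else is an assumption.
MAX_SENTENCE_LENGTH = 30


def keep_pair(premise_tokens, hypothesis_tokens, max_len=MAX_SENTENCE_LENGTH):
    """Return True if neither sentence is longer than max_len."""
    return len(premise_tokens) <= max_len and len(hypothesis_tokens) <= max_len


def filter_pairs(pairs, max_len=MAX_SENTENCE_LENGTH):
    """Keep only the (premise, hypothesis, label) triples that pass keep_pair."""
    return [
        (premise, hypothesis, label)
        for premise, hypothesis, label in pairs
        if keep_pair(premise, hypothesis, max_len)
    ]
```

With a filter like this, the threshold acts on the training data rather than inside the model, which would be consistent with the parameter not appearing in model_batch.py.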
Best, Wuwei
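
As a follow-up to the first and third points, the sketch below shows trainable pretrained embeddings feeding an LSTM whose hidden_size differs from the embedding dimension. It assumes PyTorch and is not the repository's code: the sizes, the random tensor standing in for GloVe vectors, and the variable names are all illustrative.

```python
import torch
import torch.nn as nn

# Illustrative sizes (assumptions, not values from the repository).
vocab_size, embedding_dim, hidden_size = 100, 300, 128

# Random stand-in for a pretrained GloVe matrix of shape (vocab, dim).
pretrained = torch.randn(vocab_size, embedding_dim)

# freeze=False keeps the embedding weights trainable, so they are
# updated together with the rest of the model.
embedding = nn.Embedding.from_pretrained(pretrained, freeze=False)

# input_size must match the embedding dimension; hidden_size is free.
lstm = nn.LSTM(
    input_size=embedding_dim,
    hidden_size=hidden_size,
    batch_first=True,
    bidirectional=True,
)

token_ids = torch.randint(0, vocab_size, (4, 12))  # (batch, seq_len)
encoded, _ = lstm(embedding(token_ids))

# A bidirectional LSTM concatenates both directions, so the feature
# dimension of the output is 2 * hidden_size.
print(encoded.shape)
```

If hidden_size is changed, any later layer that consumes the encoder output has to be sized to match (for example, 2 * hidden_size for a bidirectional LSTM). A hard-coded size in such a layer is one possible cause of the dimension errors described in the question, though that has not been checked against the repository.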