Ben Trevett comments

Results: 90 comments of Ben Trevett

Question in fasttext

All words not in the GloVe vocabulary will have their embeddings initialized to a normally distributed vector, which is what `unk_init` does. As the GloVe embedding only contains single...
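The initialization described above can be sketched without any vocabulary library. This is only an illustration under stated assumptions: `build_embedding_matrix`, the toy vocabulary, and the toy vectors are hypothetical, and `torch.Tensor.normal_` stands in for whatever `unk_init` is set to.

```python
import torch


def build_embedding_matrix(vocab, pretrained, dim, unk_init=torch.Tensor.normal_):
    """Copy pre-trained vectors where they exist; initialize the rest with unk_init.

    `vocab` is a list of words and `pretrained` maps word -> tensor of size `dim`.
    Both are hypothetical stand-ins for a real vocabulary and real GloVe vectors.
    """
    matrix = torch.zeros(len(vocab), dim)
    for index, word in enumerate(vocab):
        if word in pretrained:
            matrix[index] = pretrained[word]
        else:
            # In-place draw from a standard normal for out-of-vocabulary words.
            unk_init(matrix[index])
    return matrix


toy_vocab = ["the", "zzzunknown"]
toy_pretrained = {"the": torch.ones(4)}
toy_matrix = build_embedding_matrix(toy_vocab, toy_pretrained, dim=4)
```

Here the row for "the" is copied from the pre-trained table, while the row for "zzzunknown" is filled with random normal values.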

6 - Transformers for Sentiment Analysis

We don't get the CLS and SEP tokens because we use `tokenizer.tokenize` instead of `tokenizer.encode`. Ideally, I should've used `tokenizer.encode` because the BERT model expects the CLS and SEP tokens...
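As a rough sketch of what is missing when only `tokenizer.tokenize` is used, the special tokens can be added by hand to an already tokenized sequence. The helper name `add_special_tokens` and the example tokens are hypothetical; `[CLS]` and `[SEP]` are the token strings used by the standard BERT vocabularies.

```python
def add_special_tokens(tokens, max_length, cls_token="[CLS]", sep_token="[SEP]"):
    """Wrap a list of tokens in the start and end tokens a BERT model expects."""
    # Reserve two positions for the special tokens before trimming.
    trimmed = tokens[: max_length - 2]
    return [cls_token] + trimmed + [sep_token]


example_tokens = ["hello", "world", "how", "are", "you"]
wrapped = add_special_tokens(example_tokens, max_length=6)
```

With `max_length=6`, the last token is dropped so that the wrapped sequence still fits, including both special tokens.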

TypeError

Which notebook is giving you this issue?

Where are the trained model parameters?

For NLP models you can't just save the parameters, you also need to save the vocabulary and the tokenizer. I mainly wanted the tutorials to discuss how these models worked...
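A minimal sketch of saving the parameters together with the vocabulary might look like the following. The toy vocabulary, the `nn.Embedding` stand-in model, the file names, and the `tokenizer_name` entry are all assumptions made for illustration; a real tokenizer may need its own serialization.

```python
import os
import pickle
import tempfile

import torch
import torch.nn as nn

# Hypothetical vocabulary and a stand-in model.
vocab = {"<unk>": 0, "<pad>": 1, "good": 2, "bad": 3}
model = nn.Embedding(len(vocab), 8)

save_dir = tempfile.mkdtemp()
weights_path = os.path.join(save_dir, "model.pt")
extras_path = os.path.join(save_dir, "extras.pkl")

# Save the parameters and, separately, everything needed to rebuild the inputs.
torch.save(model.state_dict(), weights_path)
with open(extras_path, "wb") as handle:
    pickle.dump({"vocab": vocab, "tokenizer_name": "hypothetical-tokenizer"}, handle)

# Later: restore the vocabulary first, then rebuild the model and load the weights.
with open(extras_path, "rb") as handle:
    extras = pickle.load(handle)
restored_vocab = extras["vocab"]
restored_model = nn.Embedding(len(restored_vocab), 8)
restored_model.load_state_dict(torch.load(weights_path))
```

Without the saved vocabulary, the restored weights could not be matched to the right word indices.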

for word embedding in RNN model

Yes, if you copy the weights from pre-trained embeddings, then they are fine-tuned. The parameters will update as they are not frozen. If you want to freeze the embedding, then...
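The difference between fine-tuned and frozen embeddings can be sketched in plain PyTorch. This shows one common way to freeze an embedding layer, not necessarily the one the comment goes on to describe; the random `pretrained` tensor is a stand-in for real pre-trained vectors.

```python
import torch
import torch.nn as nn

# Stand-in for pre-trained vectors: 5 words, 3 dimensions.
pretrained = torch.randn(5, 3)

# Copying the weights leaves them trainable, so they are fine-tuned.
fine_tuned = nn.Embedding(5, 3)
fine_tuned.weight.data.copy_(pretrained)

# Freezing: stop gradients from reaching the embedding weights.
frozen = nn.Embedding(5, 3)
frozen.weight.data.copy_(pretrained)
frozen.weight.requires_grad = False

# Equivalent shortcut provided by PyTorch.
frozen_alt = nn.Embedding.from_pretrained(pretrained, freeze=True)
```

An optimizer will still accept the frozen parameters, but they receive no gradient and therefore do not change.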

The train_data built from my own dataset after following the Appendix A looks wrong

The .csv file for the dataset you're linking has six columns so your `fields` list needs to have six items, one for each column. The first column is the polarity...
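The column-count requirement can be illustrated with the standard `csv` module. This is a library-independent sketch, not the loader used in the tutorials: `read_rows`, the sample row, and every field name other than the polarity column are hypothetical, and `None` marks a column to skip.

```python
import csv
import io


def read_rows(handle, field_names):
    """Read CSV rows, requiring one entry in field_names per column."""
    rows = []
    for line_number, row in enumerate(csv.reader(handle), start=1):
        if len(row) != len(field_names):
            raise ValueError(
                f"line {line_number}: expected {len(field_names)} columns, "
                f"found {len(row)}"
            )
        # Keep only the columns that were given a name.
        rows.append(
            {name: value for name, value in zip(field_names, row) if name is not None}
        )
    return rows


# Hypothetical six-column row; only the first and last columns are kept.
sample = io.StringIO('"0","a","b","c","d","some text"\n')
field_names = ["polarity", None, None, None, None, "text"]
rows = read_rows(sample, field_names)
```

Passing fewer names than there are columns raises an error instead of silently misaligning the data.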

Representation of similar words

I'm not 100% sure of the best way of doing this. One way would be to have your model simply output the tensors from the intermediate layers, i.e. `return self.fc(cat), cat,...
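A minimal sketch of returning an intermediate tensor alongside the prediction is shown below. The `Classifier` class is a hypothetical mean-pooling model rather than the tutorial's architecture, so `pooled` plays the role that `cat` plays in the snippet above.

```python
import torch
import torch.nn as nn


class Classifier(nn.Module):
    """Toy classifier that also exposes its pooled representation."""

    def __init__(self, vocab_size, embedding_dim, output_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.fc = nn.Linear(embedding_dim, output_dim)

    def forward(self, text):
        embedded = self.embedding(text)  # [batch, seq_len, embedding_dim]
        pooled = embedded.mean(dim=1)    # [batch, embedding_dim]
        # Return the prediction and the intermediate representation together.
        return self.fc(pooled), pooled


model = Classifier(vocab_size=100, embedding_dim=16, output_dim=1)
batch = torch.randint(0, 100, (4, 7))
prediction, representation = model(batch)
```

The returned `representation` can then be compared across inputs, for example with cosine similarity, while `prediction` is still used for training.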

Add Attention to the classification models

If you know of any papers/models that have an attention-based model for sequence classification, let me know and I can look into adding those. Usually attention is used for...

Add Attention to the classification models

That model is for sequence-to-sequence learning, not sequence classification.

Add Attention to the classification models

Thanks for the link. Will check it out.