Thilina Rajapakse

Results 57 comments of Thilina Rajapakse

For most transfer learning tasks, you would usually freeze the earlier layers. But in the case of BERT and other derivatives, the approach is to fine-tune all parameters, albeit for...
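
Something like this freezes the earlier layers, if you do want to go that route (the model name and the number of frozen layers here are just examples):

```python
from transformers import BertModel

# A minimal sketch: freeze the embeddings and the first 8 encoder layers,
# leaving the remaining parameters trainable for fine-tuning.
model = BertModel.from_pretrained('bert-base-uncased')

for param in model.embeddings.parameters():
    param.requires_grad = False

for layer in model.encoder.layer[:8]:
    for param in layer.parameters():
        param.requires_grad = False
```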

Hard to say. 100k samples should be enough to train the model. Can you try the Simple Transformers library linked in the readme, as this repo is out of...
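
For reference, a minimal Simple Transformers run looks something like this (the model type, model name, and toy DataFrame are placeholders):

```python
import pandas as pd
from simpletransformers.classification import ClassificationModel

# Placeholder training data: a DataFrame with 'text' and 'labels' columns.
train_df = pd.DataFrame(
    [["Example sentence one", 1], ["Example sentence two", 0]],
    columns=["text", "labels"],
)

# Model type and name are examples; pass use_cuda=False if you have no GPU.
model = ClassificationModel("bert", "bert-base-uncased", num_labels=2)
model.train_model(train_df)

predictions, raw_outputs = model.predict(["A sentence to classify"])
```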

The data may be getting loaded from the cache dir. Try deleting any cached files. Do you have the same issue when using the Yelp data?
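
Something like this should clear it (the path is an assumption; use whatever cache dir your args point at):

```python
import shutil

# Hypothetical cache location; substitute the cache_dir from your own args.
shutil.rmtree("cache/", ignore_errors=True)
```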

Try using the Yelp dataset as given in the guide. It's impossible to say what the issue is without seeing your data. Or, consider using [Simple Transformers](https://github.com/ThilinaRajapakse/simpletransformers) as it is...

Set this in the args dict: `'output_mode': 'regression'`
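
For example (only the `'output_mode'` key matters here; merge it into the args you already have):

```python
# Minimal sketch: switches the task from classification to regression,
# so labels are treated as continuous values instead of classes.
args = {
    'output_mode': 'regression',
}
```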

_Easiest_ way to do it would probably be something like this. I am setting `label` to 0 for all the examples, but the labels will not be used.

```python
def ...
```
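
Since the snippet above is cut off, here is a plausible sketch of the idea, with a stand-in `InputExample` class (names and fields are assumptions, not the original code):

```python
from dataclasses import dataclass
from typing import List, Optional

# Stand-in for the repo's InputExample class (an assumption about its shape).
@dataclass
class InputExample:
    guid: str
    text_a: str
    text_b: Optional[str] = None
    label: int = 0

def load_unlabeled_examples(texts: List[str]) -> List[InputExample]:
    # Dummy label of 0 for every example; it is ignored at prediction time
    # and only satisfies the format the feature converter expects.
    return [InputExample(guid=str(i), text_a=text, label=0) for i, text in enumerate(texts)]
```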

It's certainly possible. Its original purpose was to give insight into examples that the model was getting wrong.

How about the approach [here](https://github.com/malteos/pytorch-bert-document-classification/blob/master/models.py) (ExtraBertMultiClassifier)?

```python
class ExtraBertMultiClassifier(nn.Module):
    def __init__(self, bert_model_path, labels_count, hidden_dim=768, mlp_dim=100, extras_dim=6, dropout=0.1):
        super().__init__()
        self.config = {
            'bert_model_path': bert_model_path,
            'labels_count': labels_count,
            'hidden_dim': hidden_dim,
            'mlp_dim': mlp_dim,
            'extras_dim': ...
```
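
The rest of that class is cut off above, but the general pattern is to concatenate BERT's pooled output with the extra (non-text) features and run a small MLP on top. A self-contained sketch of that pattern (not the linked code verbatim):

```python
import torch
import torch.nn as nn
from transformers import BertModel

# Sketch of the general pattern: BERT's pooled [CLS] representation is
# concatenated with extra features, and a small MLP produces class logits.
class BertWithExtrasClassifier(nn.Module):
    def __init__(self, bert_model_path, labels_count,
                 hidden_dim=768, mlp_dim=100, extras_dim=6, dropout=0.1):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_model_path)
        self.dropout = nn.Dropout(dropout)
        self.mlp = nn.Sequential(
            nn.Linear(hidden_dim + extras_dim, mlp_dim),
            nn.ReLU(),
            nn.Linear(mlp_dim, labels_count),
        )

    def forward(self, input_ids, attention_mask, extras):
        # pooler_output is the [CLS] representation (transformers >= 4.x
        # returns it on the model output object by default).
        pooled = self.bert(input_ids=input_ids, attention_mask=attention_mask).pooler_output
        combined = torch.cat([self.dropout(pooled), extras], dim=1)
        return self.mlp(combined)
```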

This may be caused by an update to the Hugging Face Transformers library. Please consider using [Simple Transformers](https://github.com/ThilinaRajapakse/simpletransformers) as this repo may not be compatible with the current Hugging Face...

I can't reproduce this. Are you using the latest version of the repo?