bert-sklearn
Is there any plan to support multi-label classification tasks?
As the title says.
Hi. I was not thinking of adding a multi-label classification task. I can look into it though. Is there a particular open source NLP dataset you are thinking about?
Hi charles9n. Your repo is great, and I'd like to use it on Kaggle's Toxic Comment Classification Challenge. I think I can try to add a multi-label feature to this repo if I have time soon.
Thank you. That sounds like a great idea that will be useful to others as well.
Hello,
I am interested in this as well. I was wondering if there has been any progress, or any workarounds?
I haven't heard anything. Let's see what EvanMu96 thinks...
There was a great medium blog post at the beginning of the year on this though: https://medium.com/huggingface/multi-label-text-classification-using-bert-the-mighty-transformer-69714fa3fb3d
It shouldn't be too big a change to add it here. Mainly it would mean swapping out the loss function, I think (roughly along the lines of the sketch below).
But if you need it now, the author of the medium post went on to create a really nice repo with multi-label classification. You can check it out at: https://github.com/kaushaltrivedi/fast-bert
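To sketch what I mean about the loss function, something like this (just a minimal PyTorch illustration, not code from the repo, assuming the head outputs logits of shape (batch, num_labels)):

```python
import torch
import torch.nn as nn

logits = torch.randn(8, 6)  # (batch, num_labels) from the classifier head

# Single-label setup (current): one class index per example, CrossEntropyLoss
single_label_targets = torch.randint(0, 6, (8,))           # shape (8,), dtype long
ce_loss = nn.CrossEntropyLoss()(logits, single_label_targets)

# Multi-label setup: a 0/1 float matrix of targets, BCEWithLogitsLoss
multi_label_targets = torch.randint(0, 2, (8, 6)).float()  # shape (8, 6)
bce_loss = nn.BCEWithLogitsLoss()(logits, multi_label_targets)
```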
Thank you for the references. Those certainly help. I am experimenting with sklearn's OneVsRestClassifier as a workaround for now.
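In case it's useful to anyone else, the workaround looks roughly like this (only a sketch with toy data; it assumes BertClassifier behaves as a standard sklearn estimator under clone(), and it trains one BERT model per label, so it's expensive):

```python
from sklearn.multiclass import OneVsRestClassifier
from bert_sklearn import BertClassifier

# Toy stand-ins: in practice X is the list of comment strings and Y is the
# (n_samples, n_labels) 0/1 matrix, e.g. the six toxic-comment labels.
X_train = ["you are wonderful", "I will find you", "have a nice day", "this is garbage"]
Y_train = [[0, 0], [1, 1], [0, 0], [1, 0]]   # hypothetical columns = [toxic, threat]

ovr = OneVsRestClassifier(BertClassifier())  # fits one binary BertClassifier per label
ovr.fit(X_train, Y_train)
print(ovr.predict(["you are garbage"]))      # 0/1 row per label, shape (1, n_labels)
```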
This project is nice and I want to see it get this feature! @charles9n @EvanMu96 @mividalocas. I'm attempting to fork it and add multi-label support. I got through the configuration (model.multilabel = True) and added a toxic comments test, but I'm running into tensor shape issues and I'm not very good at torch.
See this explanation of changing the final activation layer to support multi-label classification: https://dejanbatanjac.github.io/2019/07/04/softmax-vs-sigmoid.html
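For anyone following along, the gist of that post is replacing the softmax over classes with an independent sigmoid per label (a quick illustration, not code from my fork):

```python
import torch

logits = torch.randn(8, 6)  # (batch, num_labels) output of the classifier head

# Multi-class: softmax across labels, pick exactly one class per example
multiclass_pred = torch.softmax(logits, dim=-1).argmax(dim=-1)  # shape (8,)

# Multi-label: sigmoid per label, threshold each probability independently
probs = torch.sigmoid(logits)                                   # shape (8, 6)
multilabel_pred = (probs > 0.5).long()                          # 0/1 matrix, shape (8, 6)
```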
You can run the toxic comments test with: python -m pytest -sv tests/test_bert_sklearn_multilabel.py
... but I'm stuck at "Expected input batch_size (8) to match target batch_size (48)".
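My best guess at what's going on (a minimal repro, assuming batch size 8 and the 6 toxic-comment labels): the single-label code path flattens the targets before CrossEntropyLoss, so the (8, 6) label matrix becomes 48 targets against only 8 rows of logits.

```python
import torch
import torch.nn as nn

logits = torch.randn(8, 6)             # (batch=8, num_labels=6)
labels = torch.randint(0, 2, (8, 6))   # multi-label 0/1 targets

# Single-label path: labels.view(-1) turns the (8, 6) matrix into 48 "class
# indices", which no longer matches the 8 logit rows -- hence the error
# "Expected input batch_size (8) to match target batch_size (48)".
#
# Keeping the (batch, num_labels) shape and using BCEWithLogitsLoss avoids it:
loss = nn.BCEWithLogitsLoss()(logits, labels.float())
print(loss.item())
```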
Just pushed my fork: https://github.com/Shane-Neeley/bert-sklearn
Hey @Shane-Neeley have you made any progress on this? If it's still broken I can look at your fork.
Hi @brandomr .. got sidetracked (making a COVID-19 project like everyone else). Yes, if you can, please take a look. I think it's almost there. I've changed 7 files here: https://github.com/Shane-Neeley/bert-sklearn/commit/0b8a3f642046991245b033501a40cc918a9118f2
And you can run the test with python -m pytest -sv tests/test_bert_sklearn_multilabel.py
Any update on this? I'm interested in this as well.