DeepPavlov
Finding Cuss Words with NER
What problem are we trying to solve?:
Training the existing BERT model to identify cuss words and, in the future, to handle sarcasm as well.
How can we solve it?:
By using word semantics, for example Google's word2vec, to build word vectors and training the model on them to identify cuss words.
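A minimal sketch of the vector idea, not a proposed implementation: it flags tokens whose vectors lie close to the centroid of a few seed cuss words. The three-dimensional vectors, the seed list, the threshold, and the `flag_cuss_words` helper are all hypothetical and made up for illustration; real vectors would come from word2vec or another embedding source.

```python
import math

# Hypothetical toy embeddings, invented for illustration only.
# Real vectors would be loaded from word2vec or a similar model.
TOY_VECTORS = {
    "damn": [0.9, 0.1, 0.0],
    "hell": [0.8, 0.2, 0.1],
    "crap": [0.85, 0.15, 0.05],
    "table": [0.0, 0.9, 0.3],
    "hello": [0.1, 0.2, 0.9],
}

# Hypothetical seed list of known cuss words.
SEED_CUSS_WORDS = ["damn", "hell"]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(words, vectors):
    """Element-wise mean of the vectors of the given words."""
    vecs = [vectors[w] for w in words]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def flag_cuss_words(tokens, vectors=TOY_VECTORS, seeds=SEED_CUSS_WORDS, threshold=0.95):
    """Return tokens whose vector is close to the seed centroid.

    Tokens without a vector are skipped rather than guessed at.
    """
    center = centroid(seeds, vectors)
    return [
        t for t in tokens
        if t in vectors and cosine(vectors[t], center) >= threshold
    ]


if __name__ == "__main__":
    print(flag_cuss_words(["hello", "crap", "table", "unknown"]))
```

A similarity threshold like this only catches words that sit near the seeds in vector space; a trained classifier on top of the embeddings would be the next step.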
Are there other issues that block this solution?:
Cuss words are easy to identify in English, but the same word might mean something different in other languages. We need to train a model that accounts for this.
Hey @potato-patata, I would like to work on this issue. Could we do the translation with the googletrans library and then apply word semantics to train the model?
Hey @rashmiprabhat567, we can definitely use it, but we need to find a way to integrate it with the dp embeddings. So it is better to look into the dp embeddings first and then proceed.
Got it, @potato-patata. I'm looking into this issue and will let you know if I make any progress.
Hey @potato-patata! Is this issue still open to work on?
Hi, this issue is not a GSoC task. It is just an enhancement suggestion.
Moreover, I am not the mentor, so I apologise if I misled you 😄
Hi, @potato-patata! Sorry for the late response. We already have a model that detects toxicity in text, and there is also a pull request that adds a new emotion classifier.