dostoevsky Incorrect prediction


Open Ngoroth opened this issue 4 years ago • 2 comments

from typing import Dict
from dostoevsky.models import FastTextToxicModel
from dostoevsky.tokenization import RegexTokenizer

tokenizer = RegexTokenizer()
toxic_model = FastTextToxicModel(tokenizer=tokenizer)

messages = [
    'привет',
    'я люблю тебя!!',
    'малолетние дебилы'
]
results = toxic_model.predict(messages, k=2)
for message, sentiment in zip(messages, results):
    print(message, '->', sentiment)

Output:
привет -> {'normal': 0.9972950220108032, 'toxic': 0.0026416745968163013}
я люблю тебя!! -> {'toxic': 1.0000100135803223, 'normal': 1.0000003385357559e-05}
малолетние дебилы -> {'toxic': 1.0000100135803223, 'normal': 1.0000003385357559e-05}

'я люблю тебя!!' ("I love you!!") gets exactly the same toxic score as 'малолетние дебилы' (an insult), and that score is slightly above 1.0.
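The duplicate can also be confirmed without eyeballing the numbers. Below is a minimal sketch, assuming each prediction is a dict of label to score as in the output above. The helper names `find_identical_predictions` and `find_out_of_range` are hypothetical and not part of dostoevsky; the `results` list is copied from the output above so the snippet runs without the model installed.

```python
from collections import defaultdict
from typing import Dict, List

def find_identical_predictions(
    messages: List[str], results: List[Dict[str, float]]
) -> List[List[str]]:
    """Group messages whose score dicts are exactly equal (hypothetical helper)."""
    groups = defaultdict(list)
    for message, scores in zip(messages, results):
        # Sort the items so that label order does not affect the comparison.
        groups[tuple(sorted(scores.items()))].append(message)
    return [group for group in groups.values() if len(group) > 1]

def find_out_of_range(
    messages: List[str], results: List[Dict[str, float]]
) -> List[str]:
    """Return messages with any score outside [0, 1] (hypothetical helper)."""
    return [
        message
        for message, scores in zip(messages, results)
        if any(score < 0.0 or score > 1.0 for score in scores.values())
    ]

messages = ['привет', 'я люблю тебя!!', 'малолетние дебилы']
# Scores copied verbatim from the output reported above.
results = [
    {'normal': 0.9972950220108032, 'toxic': 0.0026416745968163013},
    {'toxic': 1.0000100135803223, 'normal': 1.0000003385357559e-05},
    {'toxic': 1.0000100135803223, 'normal': 1.0000003385357559e-05},
]

identical = find_identical_predictions(messages, results)
out_of_range = find_out_of_range(messages, results)
print(identical)
print(out_of_range)
```

With the reported scores, both checks flag the same two messages: their predictions are identical, and the toxic score exceeds 1.0.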

Mar 08 '20 14:03 Ngoroth

Hi, I am aware of this issue. Currently, FastTextToxicModel is not ready to use.

Mar 10 '20 10:03 dveselov

Thank you for the response, I will be waiting eagerly.

Mar 10 '20 15:03 Ngoroth
