bert_score
Semantic similarity between essays and a theme
Hi guys, thanks for this fantastic project.
I intend to use it to measure the similarity between essays written by students and a given theme. The theme is a one-line sentence and each essay is a couple of paragraphs long. I have a dataset in which essays that conform to the theme have a positive score ranging from 20 to 200, while essays that ignore the theme receive 0.
From skimming the original paper and playing around with a pre-trained BERT model suited to my essays' language rather than the default language, this looks quite doable, although not perfect. I still have some doubts about how to use the weighting, which I hope will improve the measurements I get.
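To make the setup concrete, here is a minimal sketch of the comparison, not a recommended recipe. It assumes the package's `score` function with the essays as candidates and the theme repeated as the reference. The helper names `theme_recall` and `auc` are made up for illustration, and picking recall (how much of the theme is matched somewhere in the essay) over precision or F1 is my assumption, not something the paper prescribes for this task.

```python
from typing import List, Sequence


def theme_recall(essays: List[str], theme: str, lang: str = "en") -> List[float]:
    """BERTScore recall of each essay (candidate) against the theme (reference)."""
    # Imported lazily so the helper below works without bert_score installed.
    from bert_score import score

    # score() expects one reference per candidate, so repeat the theme.
    # idf weighting is left off here: with idf=True the weights are computed
    # from the references, which in this setup are all the same sentence.
    _, recall, _ = score(essays, [theme] * len(essays), lang=lang)
    return recall.tolist()


def auc(scores: Sequence[float], on_theme: Sequence[bool]) -> float:
    """Chance that a random on-theme essay outscores a random off-theme one."""
    pos = [s for s, y in zip(scores, on_theme) if y]
    neg = [s for s, y in zip(scores, on_theme) if not y]
    if not pos or not neg:
        raise ValueError("need at least one on-theme and one off-theme essay")
    # Count pairwise wins, with ties counted as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

With my dataset, `on_theme` would be `grade > 0` for each essay, so `auc(theme_recall(essays, theme), on_theme)` gives a single number for how well the score separates the zero-graded essays from the rest (0.5 is chance, 1.0 is perfect separation).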
Anyhoo, any advice on how to approach this task? Any do's or don'ts are welcome 😃