acl2019-BERT-argument-classification-and-clustering question on argument similarity task

Hi, the UKP ASPECT Corpus has four labels:

DTORCD: Different Topic/Can’t decide
HS: High Similarity
NS: No Similarity
SS: Some Similarity

Would it be beneficial to use -1, 0, 1, 2 instead? Is this treated as a regression problem? Where does the transition take place?

Jan 19 '20 16:01 antgr

Hi, in the experiments we mapped the labels to a binary task. Otherwise, it can be difficult even for humans to decide between some and high similarity.

I think the labels are not fine-grained enough to model this as a regression task. You would need continuous labels between 0 and 1.
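
For readers who want to try the binary setup, here is a minimal sketch of such a label mapping. The thread does not say which labels fall on which side, so the grouping below (HS and SS as similar, NS and DTORCD as not similar) is an assumption, and the names `BINARY_LABEL` and `to_binary` are hypothetical helpers, not part of the repository.

```python
# Assumed grouping: HS/SS -> similar (1), NS/DTORCD -> not similar (0).
# This split is an assumption for illustration; it is not stated in the thread.
BINARY_LABEL = {
    "HS": 1,      # High Similarity
    "SS": 1,      # Some Similarity
    "NS": 0,      # No Similarity
    "DTORCD": 0,  # Different Topic/Can't decide
}


def to_binary(label: str) -> int:
    """Map a UKP ASPECT annotation label to 1 (similar) or 0 (not similar)."""
    return BINARY_LABEL[label.strip().upper()]


labels = ["HS", "NS", "SS", "DTORCD"]
binary = [to_binary(label) for label in labels]
```

An unknown label raises a `KeyError`, which makes annotation typos visible instead of silently assigning them to a class.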

Jan 23 '20 17:01 nreimers

Hello, I see there is no make_splits.py. Could you give me some advice?

Dec 15 '20 08:12 XuShiqiang9894