Can this be used on data that is not in CoNLL-2003 format?

Open hetryn opened this issue 5 years ago • 1 comments

As above, can TENER preprocessing be run on a dataset that does not follow the CoNLL-2003 format? My dataset does not use BIO scheme tagging, so each sentence looks like this:

sentence = ['Hi', 'I', 'study', 'in', 'China', 'and', 'work', 'in', 'ABC']
tag = ['O', 'O', 'O', 'O', 'Country', 'O', 'O', 'O', 'Company']

hetryn avatar Oct 30 '20 17:10 hetryn

Sorry for the late reply. You can re-use the TENER encoder, but the pre-processing and decoding may not be suitable for your input as-is. You can try converting your data to the BIOES tagging scheme.

yhcc avatar Nov 22 '20 14:11 yhcc
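
For anyone landing here with the same question, the conversion suggested above could look roughly like the sketch below. It is not part of TENER: `flat_to_bioes` is a hypothetical helper, and it assumes that adjacent tokens carrying the same non-`O` tag belong to one entity (so two different entities of the same type that sit side by side would be merged).

```python
from typing import List


def flat_to_bioes(tags: List[str], outside: str = "O") -> List[str]:
    """Convert flat per-token tags (e.g. 'Country') to BIOES tags.

    Assumption: a run of adjacent identical non-outside tags is one entity.
    """
    bioes = []
    n = len(tags)
    for i, tag in enumerate(tags):
        if tag == outside:
            bioes.append(outside)
            continue
        # A token starts an entity if the previous tag differs,
        # and ends it if the next tag differs.
        starts = i == 0 or tags[i - 1] != tag
        ends = i == n - 1 or tags[i + 1] != tag
        if starts and ends:
            prefix = "S"  # single-token entity
        elif starts:
            prefix = "B"  # beginning of a multi-token entity
        elif ends:
            prefix = "E"  # end of a multi-token entity
        else:
            prefix = "I"  # inside a multi-token entity
        bioes.append(f"{prefix}-{tag}")
    return bioes


sentence = ['Hi', 'I', 'study', 'in', 'China', 'and', 'work', 'in', 'ABC']
tag = ['O', 'O', 'O', 'O', 'Country', 'O', 'O', 'O', 'Company']
converted = flat_to_bioes(tag)
# Print one token per line with its converted tag.
for token, label in zip(sentence, converted):
    print(token, label)
```

With the example from the question, every entity is a single token, so `China` and `ABC` get `S-` tags and everything else stays `O`. Whether the converted file can then be fed to the existing pre-processing depends on the data loader in use, which is not covered here.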