ExplainaBoard
New Task: Recognition of General BRAT Span Format
BRAT is an annotation tool that covers a wide variety of NLP analysis tasks by expressing them as annotations of spans and of relations between those spans. This generalized format makes it easy to handle many tasks, such as those in edge probing and GLAD. It would be nice to implement this in ExplainaBoard. Here is a roadmap towards doing so, aimed at people who are getting started with ExplainaBoard, so it can be a good first issue.
First, it would be a good idea to read the tutorials on implementing new tasks, features, and formats.
A good first step would be to add BRAT format input to NER as an option alongside CoNLL. You can start by looking at the NER Loader, and:
- [ ] Implement `FileType.brat` in addition to `FileType.conll`
- [ ] Write a unit test that tests the reading of BRAT format in addition to CoNLL
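
As a rough sketch of what reading BRAT input involves, the snippet below parses the text-bound (`T`) lines of a BRAT standoff `.ann` file into span records. It is not ExplainaBoard code: `Span` and `parse_brat_spans` are hypothetical names chosen for illustration, and how this would plug into the NER loader and `FileType.brat` is left open. Collapsing a discontinuous span to its outermost offsets is also an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Span:
    """One text-bound BRAT annotation (hypothetical container for this sketch)."""
    ann_id: str
    label: str
    start: int
    end: int
    text: str


def parse_brat_spans(ann_text: str) -> List[Span]:
    """Parse text-bound ("T") annotations from BRAT standoff content.

    Each such line has the form: ID <TAB> TYPE START END <TAB> TEXT.
    Relations, events, and other annotation kinds are skipped.
    """
    spans = []
    for line in ann_text.splitlines():
        if not line.startswith("T"):
            continue  # not a text-bound annotation
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # malformed line; ignored in this sketch
        ann_id, type_and_offsets, text = parts[0], parts[1], parts[2]
        label, offsets = type_and_offsets.split(" ", 1)
        # Discontinuous spans look like "0 5;16 23". Assumption: keep only
        # the first start and the last end.
        fragments = [fragment.split() for fragment in offsets.split(";")]
        start = int(fragments[0][0])
        end = int(fragments[-1][1])
        spans.append(Span(ann_id, label, start, end, text))
    return spans


example_ann = (
    "T1\tOrganization 0 4\tSony\n"
    "T2\tLocation 22 27\tTokyo\n"
    "R1\tLocated Arg1:T1 Arg2:T2\n"
)
example_spans = parse_brat_spans(example_ann)
```

A unit test for the new file type could start from a small fixture like `example_ann` and check the labels and offsets of the returned spans.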
Once this is done, we can:
- [ ] Generalize the NER processor into a processor that can handle more general span identification tasks
A good pointer for where to do this is the conditional generation, machine translation, and summarization processors. Summarization and machine translation are specific instantiations of the general task of conditional generation, so this can give a template of a general task (span identification) and specific instantiations (NER, frame semantic parsing, chunking, etc.).
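
To make the general-task / specific-instantiation idea concrete, here is a minimal sketch of how a shared span-level metric could live in a base class, with task-specific subclasses that only change the task name. None of these classes are ExplainaBoard's actual processors: the class names, the `task_name` attribute, and the `f1` method are assumptions made for illustration only.

```python
from typing import Set, Tuple

# A span is (start, end, label). This representation is an assumption of the sketch.
SpanTuple = Tuple[int, int, str]


class SpanIdentificationProcessor:
    """Hypothetical general processor: logic shared by all span tasks."""

    task_name = "span-identification"

    def f1(self, gold: Set[SpanTuple], pred: Set[SpanTuple]) -> float:
        """Exact-match span F1 between gold and predicted span sets."""
        true_positives = len(gold & pred)
        if true_positives == 0:
            return 0.0
        precision = true_positives / len(pred)
        recall = true_positives / len(gold)
        return 2 * precision * recall / (precision + recall)


class NERProcessor(SpanIdentificationProcessor):
    """Specific instantiation: inherits the metric, overrides the task name."""

    task_name = "named-entity-recognition"


class ChunkingProcessor(SpanIdentificationProcessor):
    """Another instantiation that reuses the same span-level logic."""

    task_name = "chunking"


gold_spans = {(0, 4, "ORG"), (22, 27, "LOC")}
pred_spans = {(0, 4, "ORG"), (10, 12, "PER")}
ner_f1 = NERProcessor().f1(gold_spans, pred_spans)
```

With this layout, anything that only depends on spans (metrics, bucketing by span length, and so on) would be written once in the base class, and each task would add only what is specific to it.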
@neubig I'd like to take this up. Could you please add me as an assignee?
Hi, @divija96 thanks for your interest, just did it.
Hi @pfliu-nlp ,
@divija96 and I will be working on this issue, can you add me to the assignees as well? Thanks!
↑done