New Task: Recognition of General BRAT Span Format

Open neubig opened this issue 2 years ago • 4 comments

BRAT is an annotation tool that covers a wide variety of NLP analysis tasks by expressing them as annotations of spans and of relations between those spans. This generalized format makes it easy to handle many tasks, such as those in edge probing and GLAD. It would be nice to implement this in ExplainaBoard. Here is a roadmap towards doing so, aimed at people who are getting started with ExplainaBoard, so it can be a good first issue.

First, it would be a good idea to read the tutorials on implementing new tasks, features, and formats.

A good first step would be to add BRAT format input to NER as an option alongside CoNLL. You can start by looking at the NER Loader, and:

  • [ ] Implement FileType.brat in addition to FileType.conll
  • [ ] Write a unit test that tests the reading of BRAT format in addition to CoNLL
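
For anyone picking this up, here is a minimal sketch of reading the span ("T") lines of a BRAT standoff `.ann` file. It is not ExplainaBoard's loader API: `BratSpan` and `parse_brat_spans` are hypothetical names used only for illustration, and the sketch assumes well-formed, tab-separated lines. Relation, event, and note lines are skipped.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BratSpan:
    """One text-bound annotation (a "T" line) from a BRAT .ann file.

    Hypothetical container for illustration, not an ExplainaBoard class.
    """

    ann_id: str
    label: str
    # Character offsets as (start, end) pairs, end exclusive.
    # Discontinuous spans have more than one pair.
    offsets: tuple[tuple[int, int], ...]
    text: str


def parse_brat_spans(ann_text: str) -> list[BratSpan]:
    """Parse the text-bound annotations out of the contents of a .ann file."""
    spans = []
    for line in ann_text.splitlines():
        # Only "T" lines describe spans; relations ("R"), events ("E"),
        # attributes ("A"), notes ("#"), etc. are ignored in this sketch.
        if not line.startswith("T"):
            continue
        # Format: ID <TAB> LABEL START END[;START END...] <TAB> TEXT
        ann_id, label_and_offsets, text = line.split("\t", 2)
        label, offset_str = label_and_offsets.split(" ", 1)
        offsets = tuple(
            (int(start), int(end))
            for start, end in (frag.split() for frag in offset_str.split(";"))
        )
        spans.append(BratSpan(ann_id, label, offsets, text))
    return spans


example_ann = (
    "T1\tOrganization 0 4\tSony\n"
    "T2\tLocation 10 15;20 27\tNorth America\n"
    "R1\tOrigin Arg1:T1 Arg2:T2\n"
)
example_spans = parse_brat_spans(example_ann)
```

A unit test for the second checklist item could feed a small `.ann` string like `example_ann` through the reader and compare the labels and offsets against the same sentence loaded from CoNLL.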

Once this is done, we can:

  • [ ] Generalize the NER processor into a processor that can handle more general span identification tasks

A good reference for how to do this is the conditional generation, machine translation, and summarization processors. Summarization and machine translation are specific instantiations of the general task of conditional generation, so they can serve as a template for a general task (span identification) with specific instantiations (NER, frame semantic parsing, chunking, etc.).
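
To make the general/specific split concrete, here is a rough sketch of one way it could look. All class names, the `allowed_labels` attribute, and the label sets are assumptions for illustration only; they do not mirror ExplainaBoard's actual processor classes. The general class scores exact-match spans, and each specific task only narrows the label inventory.

```python
from __future__ import annotations

# A span is (label, start, end); hypothetical representation for this sketch.
Span = tuple


class SpanIdentificationProcessor:
    """Hypothetical general processor: exact-match F1 over labeled spans."""

    # None means every label is accepted; subclasses may restrict this.
    allowed_labels: frozenset | None = None

    def _filter(self, spans: list) -> set:
        # Drop spans whose label is outside this task's inventory.
        if self.allowed_labels is None:
            return set(spans)
        return {s for s in spans if s[0] in self.allowed_labels}

    def f1(self, gold: list, pred: list) -> float:
        gold_set, pred_set = self._filter(gold), self._filter(pred)
        true_pos = len(gold_set & pred_set)
        if true_pos == 0:
            return 0.0
        precision = true_pos / len(pred_set)
        recall = true_pos / len(gold_set)
        return 2 * precision * recall / (precision + recall)


class NERProcessor(SpanIdentificationProcessor):
    """Specific instantiation: only entity labels count (assumed label set)."""

    allowed_labels = frozenset({"PER", "ORG", "LOC", "MISC"})


class ChunkingProcessor(SpanIdentificationProcessor):
    """Specific instantiation: only phrase labels count (assumed label set)."""

    allowed_labels = frozenset({"NP", "VP", "PP"})


gold_spans = [("ORG", 0, 4), ("LOC", 10, 15)]
pred_spans = [("ORG", 0, 4), ("LOC", 10, 14)]
ner_f1 = NERProcessor().f1(gold_spans, pred_spans)
```

With this layout, adding a new span task means adding a small subclass, much as summarization and machine translation specialize conditional generation.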

neubig, Mar 22 '22 01:03

@neubig I'd like to take this up. Could you please add me as an assignee?

divija96, Mar 24 '22 04:03

Hi, @divija96 thanks for your interest, just did it.

pfliu-nlp, Mar 24 '22 13:03

Hi @pfliu-nlp ,

@divija96 and I will be working on this issue, can you add me to the assignees as well? Thanks!

woshiyyya, Mar 25 '22 15:03

↑done

neubig, Mar 25 '22 21:03