
ner-re-with-transformers-odsc2022

Building NER and RE components using HuggingFace Transformers

Title

Transformer-based approaches to Named Entity Recognition (NER) and Relationship Extraction (RE)

Session type

Workshop (hands-on)

Abstract

Named Entity Recognition (NER) and Relationship Extraction (RE) are foundational for many downstream NLP tasks such as Information Retrieval and Knowledge Base construction. While pre-trained models exist for both NER and RE, they are usually specialized for some narrow application domain. If your application domain is different, your best bet is to train your own models. However, the costs associated with training, specifically generating training data, can be a significant deterrent to doing so. Fortunately, the language models learned by pre-trained Transformers capture a lot about the language of the domain they are trained and fine-tuned on, so NER and RE models built on top of them need fewer training examples to deliver the same level of performance. In this workshop, participants will learn about, train, and evaluate Transformer-based neural models for NER and RE.

Outline

  • Background (25 mins)
    • Introduction and General Concepts
  • Named Entity Recognition (1 hour)
    • Neural and Transformer-based architectures for Named Entity Recognition (a minimal fine-tuning sketch appears after this outline)
    • Case Study #1: BERT-based NER fine-tuned on the Groningen Meaning Bank (GMB) dataset
    • Case Study #2: Switching out BERT for DistilBERT
    • Case Study #3: XLM-RoBERTa-based NER fine-tuned using the HuggingFace Trainer API
  • Relationship Extraction (1 hour)
    • Neural and Transformer-based architectures for Relationship Extraction (an entity-marker sketch also appears after this outline)
    • Case Study #4: BERT-based RE (variant e: entity markers, mention pooling) fine-tuned on the New York Times dataset
    • Case Study #5: BERT-based RE (variant b: standard mention pooling) fine-tuned on the New York Times dataset
    • Case Study #6: BERT-based RE (variant f: entity markers, entity start) fine-tuned on the New York Times dataset
  • Conclusion (20 mins)
    • Wrap-up
    • Q/A session
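
To give a flavor of the NER case studies (#1-#3), the Python sketch below shows token-classification fine-tuning with HuggingFace Transformers. It is a minimal illustration under stated assumptions, not the workshop code: the checkpoint, the label subset, and the column names ("tokens", "ner_tags") are placeholders, and dataset loading is left to the notebooks.

    # Minimal sketch: NER as token classification with HuggingFace Transformers.
    # Checkpoint, label set, and column names are illustrative assumptions.
    from transformers import (AutoTokenizer, AutoModelForTokenClassification,
                              Trainer, TrainingArguments)

    labels = ["O", "B-per", "I-per", "B-geo", "I-geo"]  # assumed subset of GMB tags
    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
    model = AutoModelForTokenClassification.from_pretrained(
        "bert-base-cased", num_labels=len(labels))

    def tokenize_and_align(example):
        # WordPiece splits words into sub-word tokens, so word-level tags must
        # be re-aligned; special tokens get the ignore index -100.
        encoded = tokenizer(example["tokens"], is_split_into_words=True,
                            truncation=True)
        encoded["labels"] = [-100 if wid is None else example["ner_tags"][wid]
                             for wid in encoded.word_ids()]
        return encoded

    # With train/eval splits mapped through tokenize_and_align (see notebooks):
    # trainer = Trainer(model=model,
    #                   args=TrainingArguments(output_dir="ner-gmb",
    #                                          num_train_epochs=3),
    #                   train_dataset=train_split, eval_dataset=eval_split)
    # trainer.train()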
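For the RE case studies (#4-#6), variant f ("entity markers, entity start") is perhaps the easiest to picture: wrap each entity mention in marker tokens, encode the sentence, and classify the relation from the encoder outputs at the two start markers. The sketch below is again an illustration under assumptions (the marker strings, checkpoint, and relation count are made up), not the notebook code.

    # Minimal sketch of the "entity markers, entity start" RE variant:
    # marker strings and the relation count are illustrative assumptions.
    import torch
    from transformers import AutoTokenizer, AutoModel

    MARKERS = ["[E1]", "[/E1]", "[E2]", "[/E2]"]
    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
    tokenizer.add_special_tokens({"additional_special_tokens": MARKERS})
    encoder = AutoModel.from_pretrained("bert-base-cased")
    encoder.resize_token_embeddings(len(tokenizer))  # room for the new markers

    num_relations = 24  # assumed size of the relation inventory
    classifier = torch.nn.Linear(2 * encoder.config.hidden_size, num_relations)

    text = "[E1] Barack Obama [/E1] was born in [E2] Honolulu [/E2] ."
    inputs = tokenizer(text, return_tensors="pt")
    hidden = encoder(**inputs).last_hidden_state  # (1, seq_len, hidden)

    # Find the [E1] and [E2] start markers and concatenate their vectors.
    e1_id, e2_id = tokenizer.convert_tokens_to_ids(["[E1]", "[E2]"])
    ids = inputs["input_ids"][0]
    e1_pos = (ids == e1_id).nonzero()[0, 0]
    e2_pos = (ids == e2_id).nonzero()[0, 0]
    pair = torch.cat([hidden[0, e1_pos], hidden[0, e2_pos]])
    logits = classifier(pair)  # relation scores for this entity pair

Variants b and e differ only in how the entity representations are pooled (mention pooling over the entity's tokens rather than reading off the start markers); the classification head stays the same.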

Running the Code

  • (optional) Fork this repository
  • Navigate to the Colab Web UI (https://colab.research.google.com)
  • Click on the GitHub tab
  • Enter the URL of your forked (or this) repository in the field titled "Enter a GitHub URL" and hit the Search icon
  • You should see the notebooks appear in the results; click the one you want to open in Colab
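
As an alternative to the search flow above, Colab can open a notebook directly from a URL of the following form; substitute your GitHub user name and the actual notebook filename (the branch name "main" here is an assumption):

    https://colab.research.google.com/github/<user>/ner-re-with-transformers-odsc2022/blob/main/<notebook>.ipynb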

Datasets

Additional Links