Add Token Classification / Named Entity Recognition (NER) Example with KerasNLP

Open awsaf49 opened this issue 1 year ago • 2 comments

Issue Type

Feature Request

Keras Version

Keras 3

Current Behavior?

Token classification, of which Named Entity Recognition (NER) is the best-known example, is a highly popular problem in Natural Language Processing (NLP). The community would benefit greatly from an example on keras.io that uses KerasNLP.

  • There is an existing example titled Named Entity Recognition with Transformers, but it takes a simplified approach to NER: it uses StringLookup instead of a tokenizer, and it implements a basic Transformer model rather than using a pre-trained one. The tutorial is informative, but it may not yield competitive performance.

  • There was also a discussion about this in https://github.com/keras-team/keras-io/pull/1291 and https://github.com/keras-team/keras-nlp/issues/927 almost a year ago, which suggested some changes to KerasNLP to include a token classification example, but those issues were closed.
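To illustrate the first point: once a subword tokenizer replaces a word-level lookup, the word-level NER labels have to be aligned to the subword tokens. Below is a minimal sketch of one common convention (label the first piece of each word, mask the rest). It is not taken from the existing example or from KerasNLP; `toy_subword_tokenize`, `align_labels`, `IGNORE_LABEL`, and the label ids are all hypothetical names chosen for illustration, and the toy tokenizer only stands in for a real pre-trained one.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical sentinel for token positions that should be excluded from the loss.
IGNORE_LABEL = -100


def toy_subword_tokenize(word: str, max_len: int = 4) -> List[str]:
    """Hypothetical stand-in for a real subword tokenizer: fixed-size chunks."""
    pieces = [word[i:i + max_len] for i in range(0, len(word), max_len)]
    # Mark continuation pieces with "##" so they are easy to spot.
    return [p if i == 0 else "##" + p for i, p in enumerate(pieces)]


def align_labels(
    words: Sequence[str],
    labels: Sequence[int],
    tokenize: Callable[[str], List[str]] = toy_subword_tokenize,
) -> Tuple[List[str], List[int]]:
    """Expand word-level labels to subword tokens.

    The first piece of each word keeps the word's label; the remaining
    pieces get IGNORE_LABEL so they can be masked out during training.
    """
    tokens: List[str] = []
    token_labels: List[int] = []
    for word, label in zip(words, labels):
        pieces = tokenize(word)
        if not pieces:  # skip empty words so tokens and labels stay aligned
            continue
        tokens.extend(pieces)
        token_labels.extend([label] + [IGNORE_LABEL] * (len(pieces) - 1))
    return tokens, token_labels


words = ["Alice", "lives", "in", "Copenhagen"]
labels = [1, 0, 0, 2]  # hypothetical ids: 0 = O, 1 = B-PER, 2 = B-LOC
tokens, token_labels = align_labels(words, labels)
print(list(zip(tokens, token_labels)))
```

A word-level lookup never needs this step because each word maps to exactly one id; with a pre-trained model's tokenizer, some form of alignment like this is usually required before training and again when mapping predictions back to words.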

Tutorial

I've recently published a notebook for Kaggle's "PII Data Detection" competition that demonstrates the token classification task from scratch with KerasNLP and achieves very competitive performance. I would love to add it to keras.io. As this notebook was created for a Kaggle competition, I'm open to suggestions for making it more suitable for keras.io.

I think this notebook can also serve as a potential solution to https://github.com/keras-team/keras-nlp/issues/927, as I implemented the components for token classification with KerasNLP that were discussed there.

Relevant log output

No response

awsaf49 Feb 17 '24 04:02