Add Token Classification / Named Entity Recognition (NER) Example with KerasNLP
Issue Type
Feature Request
Keras Version
Keras 3
Current Behavior?
Token classification, whose best-known instance is Named Entity Recognition (NER), is a popular problem in Natural Language Processing (NLP). An example on keras.io that uses KerasNLP would greatly benefit the community.
- While there is an existing example titled Named Entity Recognition with Transformers, it takes a simplistic approach to NER. For instance, instead of a tokenizer it uses `StringLookup`, and rather than a pre-trained model it implements a basic `Transformer` model. While this tutorial is informative, it may not yield competitive performance.
- There was also a discussion about this in https://github.com/keras-team/keras-io/pull/1291 and https://github.com/keras-team/keras-nlp/issues/927 almost a year ago, which suggested some changes to KerasNLP to include a token classification example. But those issues were closed.
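To illustrate why moving from a word-level lookup to a subword tokenizer needs extra components, here is a minimal sketch of label alignment: each word-level tag is assigned to the word's first subword piece, and the remaining pieces get a sentinel so they can be masked out of the loss. This is not code from the notebook or from KerasNLP. `toy_subword_split`, `align_labels`, and `IGNORE_LABEL` are hypothetical names, and the fixed-size splitter is only a stand-in for a real pre-trained tokenizer.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical sentinel for subword positions that should be excluded from the loss.
IGNORE_LABEL = -100


def toy_subword_split(word: str, max_len: int = 4) -> List[str]:
    """Stand-in for a real subword tokenizer: chop a word into fixed-size pieces."""
    return [word[i:i + max_len] for i in range(0, len(word), max_len)]


def align_labels(
    words: Sequence[str],
    labels: Sequence[int],
    split: Callable[[str], List[str]] = toy_subword_split,
) -> Tuple[List[str], List[int]]:
    """Expand word-level labels to subword level.

    The first piece of each word keeps the word's label; every later piece
    receives IGNORE_LABEL.
    """
    if len(words) != len(labels):
        raise ValueError("words and labels must have the same length")

    tokens: List[str] = []
    token_labels: List[int] = []
    for word, label in zip(words, labels):
        pieces = split(word)
        tokens.extend(pieces)
        # Label only the first piece; mask the continuation pieces.
        token_labels.extend(
            label if i == 0 else IGNORE_LABEL for i in range(len(pieces))
        )
    return tokens, token_labels


# Example: 1 = person, 2 = location, 0 = outside any entity (made-up label ids).
example_tokens, example_labels = align_labels(
    ["John", "lives", "in", "Amsterdam"], [1, 0, 0, 2]
)
print(list(zip(example_tokens, example_labels)))
```

With a word-level `StringLookup` there is exactly one id per word, so this step is unnecessary; with a subword tokenizer, some alignment rule of this kind has to be chosen, and labeling only the first piece is just one common option.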
Tutorial
I've recently published a notebook for Kaggle's "PII Data Detection" competition that demonstrates the token classification task from scratch with KerasNLP and achieves very competitive performance. I would love to add it to keras.io. As this notebook was created for a Kaggle competition, I'm open to suggestions on how to make it more suitable for keras.io.
I think this notebook can also serve as a potential solution to https://github.com/keras-team/keras-nlp/issues/927, as I implemented the components for token classification with KerasNLP that were discussed there.
Relevant log output
No response