Adding custom regex for custom labels
Describe the outcome you'd like:
I would like to add custom regex patterns corresponding to custom labels. For instance, I would add a regex to recognize German passport numbers, and add the corresponding GERMAN_PASSPORT label.
Is this possible at the moment? Or is it a feature you have on your roadmap?
Additional context:
Regex models are implemented in the code, but I see no obvious way of adding new regex patterns. In your documentation, the emphasis is on training and adding new neural networks, but adding custom regex detection would be a much simpler way to customize and extend labeling.
We allow for any model to be added and used for label detection. Our main focus is utilizing deep learning to enhance the detection beyond regex capabilities for more complex tasks.
We do include a regex model in the repo, which can be imported and updated if desired. You would have to manually add your own regex model parameters for new labels to be detected. Adding an example of the regex model is on the roadmap. I'll talk with @lettergram about pushing that out sooner rather than later.
Thanks, an example would be perfect.
@ian-contiamo Can you check out the new example to see if this meets your needs?
Thanks @JGSweets, it's obviously of great help!
I'm still wrapping my head around how you do things, and I will try to figure out how to add new labels rather than replace the existing ones altogether.
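For anyone reading along, here is a minimal, library-independent sketch of the "add rather than replace" idea, using only Python's `re` module. It is not the repo's regex model or its parameter format: `EXISTING_PATTERNS`, `CUSTOM_PATTERNS`, `merge_patterns`, and `label_text` are hypothetical names for illustration, and the passport pattern is a simplified assumption (nine characters drawn from digits and a restricted letter set), not a validated specification.

```python
import re

# Hypothetical stand-in for the labels a regex model already knows about.
EXISTING_PATTERNS = {
    "EMAIL_ADDRESS": [r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"],
}

# Assumed, simplified shape of a German passport number (illustration only).
CUSTOM_PATTERNS = {
    "GERMAN_PASSPORT": [r"\b[CFGHJKLMNPRTVWXYZ][CFGHJKLMNPRTVWXYZ0-9]{8}\b"],
}


def merge_patterns(existing, custom):
    """Return a new mapping with the custom labels added to the existing ones."""
    # Copy the lists so the original mapping is left untouched.
    merged = {label: list(regexes) for label, regexes in existing.items()}
    for label, regexes in custom.items():
        merged.setdefault(label, []).extend(regexes)
    return merged


def label_text(text, patterns):
    """Return sorted (start, end, label) spans for every pattern match in text."""
    spans = []
    for label, regexes in patterns.items():
        for regex in regexes:
            for match in re.finditer(regex, text):
                spans.append((match.start(), match.end(), label))
    return sorted(spans)


if __name__ == "__main__":
    patterns = merge_patterns(EXISTING_PATTERNS, CUSTOM_PATTERNS)
    sample = "Mail a@b.de, passport C01X00T47"
    for start, end, label in label_text(sample, patterns):
        print(label, sample[start:end])
```

Merging into a copy keeps the original labels intact, so the custom label is purely additive. How such a mapping would be handed to the actual regex model depends on the parameters shown in the new example.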