NeMo
[NLP/PC] add support for capitalization classes lower (L), upper (U), capitalize (C)
Signed-off-by: Iztok Lebar Bajec [email protected]
What does this PR do ?
While #3819 is still in progress, this PR upgrades the current punctuation and capitalization model with support for lowercasing, uppercasing, and capitalisation.
Collection: NLP/PC
Changelog
Changed the capitalisation decision. Previously, a word was capitalised whenever its capit_label differed from the no-op label (O). Now the operation is selected from three classes: lowercase (L), uppercase (U), and capitalize (C).
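The three-class decision can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `apply_capit_label` is hypothetical, and using `str.capitalize()` for class C (which also lowercases the rest of the word) is an assumption about the intended behaviour.

```python
def apply_capit_label(word: str, label: str) -> str:
    """Apply one capitalisation class to a single word (illustrative only)."""
    if label == "L":
        return word.lower()
    if label == "U":
        return word.upper()
    if label == "C":
        # Assumption: first character upper, remaining characters lower.
        return word.capitalize()
    # "O" (no-op) and any unknown label leave the word unchanged.
    return word


words = ["hello", "NASA", "world", "iPhone"]
labels = ["C", "L", "U", "O"]
result = [apply_capit_label(w, l) for w, l in zip(words, labels)]
print(result)  # ['Hello', 'nasa', 'WORLD', 'iPhone']
```

Falling through to the unchanged word for unknown labels is a design choice of this sketch; the actual model code may handle unexpected labels differently.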
Warning: This is a breaking change. The class label previously used for capitalisation was U, and prior to this PR any label other than O triggered capitalisation. Since U now means uppercase, models trained prior to this PR will return selected words in all caps instead of capitalising them. Retraining with the new label set, however, provides the additional functionality.
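To make the breaking change concrete, the sketch below runs the same label sequence, as a model trained before this PR might emit it, through both decision rules. Both functions are hypothetical stand-ins written for this illustration, and the use of `str.capitalize()` for the old behaviour is an assumption.

```python
def legacy_decision(word: str, label: str) -> str:
    # Assumed pre-PR rule: any label other than "O" capitalises the word.
    return word.capitalize() if label != "O" else word


def three_class_decision(word: str, label: str) -> str:
    # Assumed post-PR rule: the label selects one of three operations.
    operations = {"L": str.lower, "U": str.upper, "C": str.capitalize}
    return operations.get(label, lambda w: w)(word)


words = ["we", "met", "john", "in", "paris"]
# Labels from a model trained before this PR, where "U" meant "capitalise".
old_labels = ["U", "O", "U", "O", "U"]

before = " ".join(legacy_decision(w, l) for w, l in zip(words, old_labels))
after = " ".join(three_class_decision(w, l) for w, l in zip(words, old_labels))

print(before)  # We met John in Paris
print(after)   # WE met JOHN in PARIS
```

A model retrained with the new label set would be expected to emit C rather than U for these words, which restores the capitalised output.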
Usage
- You can potentially add a usage example below
PR Type:
- [x] New Feature
- [ ] Bugfix
- [ ] Documentation
Who can review?
@PeganovAnton