WordPiece, Sentencepiece, Refactor, Correct configurations
Open · nglehuy opened this issue 3 years ago · 0 comments
Features
- Add another `text_featurizer` using WordPiece from `tensorflow_text.FastWordpieceTokenizer`
- Update `SentencepieceFeaturizer` to use `tensorflow_text.FastSentencepieceTokenizer`
- Add a `tf_extract` function in `text_featurizer` to support datasets on TPUs with the `use_tf: True` option
- Add a `jit_compile` option in the model's `compile` (for faster fixed-shape training using XLA)
- Add Gaussian weight noise in the transducer decoder, plus a wrapper function to apply and remove the weight noise
- Add convolution blur pool
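
To illustrate the apply-and-remove pattern behind the Gaussian weight noise item above, here is a minimal sketch in plain Python. It is not the code from this change: the `weight_noise` context manager, the dict-based `decoder`, and the `stddev` value are all hypothetical, and a real implementation would operate on the decoder's TensorFlow variables instead of a list of floats.

```python
import random
from contextlib import contextmanager


def apply_weight_noise(weights, stddev, rng):
    """Return (noisy_weights, originals) without mutating the input list."""
    originals = list(weights)
    noisy = [w + rng.gauss(0.0, stddev) for w in weights]
    return noisy, originals


@contextmanager
def weight_noise(layer, stddev=0.075, seed=None):
    """Temporarily perturb layer["weights"], then restore the clean values.

    `layer` is a hypothetical stand-in for a decoder layer: a dict that
    holds a flat list of floats under the "weights" key.
    """
    rng = random.Random(seed)
    noisy, originals = apply_weight_noise(layer["weights"], stddev, rng)
    layer["weights"] = noisy
    try:
        yield layer
    finally:
        # Always remove the noise, even if the training step raises.
        layer["weights"] = originals


decoder = {"weights": [0.5, -1.0, 2.0]}
with weight_noise(decoder, stddev=0.1, seed=0) as noisy_decoder:
    # A forward/backward pass would run here with the noisy weights.
    noisy_snapshot = list(noisy_decoder["weights"])
restored = decoder["weights"]
```

Wrapping the step in a context manager keeps the clean weights from being lost if the training step fails partway through.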
Fixes
- Refactor code (correct models, functions, unittest, configs, ...)
- Update `ASRDatasets` to support text-featurizer-independent TFRecords (create TFRecords with only audio and transcript, instead of audio and indices)
- Replace some `experimental` options with their officially supported equivalents
- Drop support for `tensorflow < 2.8` (for older versions, please use `TensorFlowASR ~= v1.x`)
- Remove unused dependencies
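
The idea behind text-featurizer-independent records can be sketched as follows: store the raw transcript, and tokenize only at read time, so the same records work with any featurizer. This is an assumption-laden illustration, not the `ASRDatasets` implementation: it uses JSON instead of the TFRecord format, and `make_record`, `load_record`, and both toy tokenizers are hypothetical names.

```python
import json


def make_record(audio_bytes, transcript):
    """Serialize raw audio and the raw transcript; no token indices are stored."""
    return json.dumps({"audio": audio_bytes.hex(), "transcript": transcript})


def load_record(serialized, tokenize):
    """Decode a record and tokenize the transcript at read time."""
    data = json.loads(serialized)
    return bytes.fromhex(data["audio"]), tokenize(data["transcript"])


def char_tokenize(text):
    """Toy character featurizer: space -> 0, 'a'..'z' -> 1..26."""
    return [0 if c == " " else ord(c) - ord("a") + 1 for c in text]


def word_tokenize(text):
    """Toy word featurizer with a two-word vocabulary; unknown words -> 0."""
    vocab = {"hello": 1, "world": 2}
    return [vocab.get(word, 0) for word in text.split()]


# One stored record, read back with two different featurizers.
record = make_record(b"\x00\x01", "hello world")
audio, char_ids = load_record(record, char_tokenize)
_, word_ids = load_record(record, word_tokenize)
```

Because the indices are computed on load, switching from one featurizer to another does not require regenerating the records.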