Tomás Osório
## 🚀 Feature Add transforms for data augmentation **Motivation** During training, data augmentation is essential for better performance
## 🚀 Feature Calculate statistics around the processed data. **Motivation** Knowing global statistics for the processed document could be of great interest, such as the number of chars, tokens, processed...
## 🚀 Feature Allow the user to add a custom Tokenizer with minimal implementation effort. **Motivation** It is impossible to integrate every tool, but that does not mean that it shouldn't...
If we have a pipeline that runs transforms A, B, and C and we process a doc with it, and we then create another pipeline that also adds D, it should be possible to just...
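
To make the custom Tokenizer request above more concrete, here is a minimal sketch of what a "minimal implementation" contract could look like. Everything in it is an assumption made for illustration: the `Tokenizer` protocol, `WhitespaceTokenizer`, `FunctionTokenizer`, and `count_tokens` are hypothetical names and are not taken from the project's actual API.

```python
from typing import Callable, List, Protocol


class Tokenizer(Protocol):
    """Hypothetical interface: anything that maps text to a list of tokens."""

    def tokenize(self, text: str) -> List[str]:
        ...


class WhitespaceTokenizer:
    """Example user-defined tokenizer: splits on runs of whitespace."""

    def tokenize(self, text: str) -> List[str]:
        return text.split()


class FunctionTokenizer:
    """Adapter (hypothetical) so a plain function can serve as a tokenizer."""

    def __init__(self, fn: Callable[[str], List[str]]) -> None:
        self._fn = fn

    def tokenize(self, text: str) -> List[str]:
        return self._fn(text)


def count_tokens(text: str, tokenizer: Tokenizer) -> int:
    """Stand-in for library code that only relies on the Tokenizer protocol."""
    return len(tokenizer.tokenize(text))


if __name__ == "__main__":
    print(count_tokens("a custom tokenizer", WhitespaceTokenizer()))
    print(count_tokens("a,b,c", FunctionTokenizer(lambda s: s.split(","))))
```

With a structural protocol like this, a user would only need to supply a single `tokenize` method (or wrap an existing function) to plug in a third-party tool, without subclassing anything from the library.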