Aflah

Results 125 comments of Aflah

I can work on this if no one else is taking this up!

Hey @blackhat-coder, are you still working on this?

This is also partly inspired by the ideas mentioned in the [GSoC document](https://docs.google.com/document/d/1fLDLwIhnwDUz3uUV8RyUZiOlmTN9Uzy5ZuvI8iDDFf8/edit#)

@mattdangerw and the rest of the Keras team, it would be great to hear your thoughts on this

As a starting point I've implemented EDA, while also fixing some of the bugs present in the original EDA code, such as not excluding stop words in some...

Now that I've implemented it, it seems I'm getting roughly the 3% gains mentioned in the paper: https://github.com/aflah02/Easy-Data-Augmentation-Implementation/blob/main/EDA.ipynb What should the next step be? @mattdangerw, or anyone else from the Keras team
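For anyone skimming, the core of EDA's synonym-replacement op looks roughly like this. This is a toy sketch: the real implementation draws synonyms from WordNet, so the hand-coded `SYNONYMS` table here is purely an assumption for illustration.

```python
import random

# Toy stand-in synonym table -- the actual EDA paper draws synonyms
# from WordNet synsets, so treat this lookup as an assumption.
SYNONYMS = {
    "quick": ["fast", "speedy"],
    "happy": ["glad", "joyful"],
}
STOP_WORDS = {"the", "a", "an", "is", "and", "of"}

def synonym_replacement(words, n, seed=0):
    """EDA's SR op: replace up to n non-stop-word tokens with a synonym."""
    rng = random.Random(seed)
    out = list(words)
    # Skip stop words -- the bug mentioned above was that the original
    # code sometimes failed to exclude them.
    candidates = [i for i, w in enumerate(out)
                  if w not in STOP_WORDS and w in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out
```

The other three EDA ops (random insertion, swap, deletion) follow the same pattern of sampling positions and editing the token list.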

@mattdangerw Sure! I had quite a bit of fun implementing this too. While you all figure out the approach you'd prefer, I'll try implementing other techniques as well

Backtranslation on a smaller sample size also seems to give pretty good results: https://github.com/aflah02/BackTranslation-Based-Data-Augmentation Maybe we could parallelize the translation calls to make it faster for large datasets,...
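The round-trip plus parallel fan-out could be sketched like this. Note the `translate` function is a placeholder (an assumption), standing in for a real NMT model or translation API:

```python
from concurrent.futures import ThreadPoolExecutor

def translate(text, src, tgt):
    # Placeholder for a real NMT call (e.g. a MarianMT checkpoint or a
    # translation API) -- assumed here; we just tag the text so the
    # round-trip structure is visible.
    return f"[{src}->{tgt}] {text}"

def back_translate(text, pivot="fr"):
    """Round-trip en -> pivot -> en; paraphrases come from translation drift."""
    return translate(translate(text, "en", pivot), pivot, "en")

def back_translate_corpus(texts, workers=4):
    # Translation dominates the runtime, so fan out across a pool;
    # with a real model you'd batch inputs instead of mapping one-by-one.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(back_translate, texts))
```

Whether threads, processes, or model-side batching wins would depend on where the translation actually runs.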

I recently came across the paper [SSMBA: Self-Supervised Manifold Based Data Augmentation for Improving Out-of-Domain Robustness](https://aclanthology.org/2020.emnlp-main.97.pdf), which uses a corruption and reconstruction function to generate new samples from the real...
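The corrupt-then-reconstruct loop could be sketched as below. The corruption step just masks tokens; the reconstruction step would be a masked LM (e.g. a BERT fill-mask head), which is abstracted here as a `fill_fn` callback, so that part is an assumption:

```python
import random

MASK = "<mask>"

def corrupt(tokens, p=0.15, seed=0):
    """Corruption step: mask each token independently with probability p."""
    rng = random.Random(seed)
    return [MASK if rng.random() < p else t for t in tokens]

def reconstruct(tokens, fill_fn):
    """Reconstruction step: a masked LM proposes a fill for each mask.
    fill_fn stands in for a real fill-mask model (an assumption here)."""
    return [fill_fn(i, tokens) if t == MASK else t
            for i, t in enumerate(tokens)]
```

Sampling from the reconstruction model rather than taking its argmax is what moves the new examples around on the data manifold.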

Another great paper is [Synthetic and Natural Noise Both Break Neural Machine Translation](https://arxiv.org/pdf/1711.02173.pdf), which aims to make NMT models more robust to typos and other corruptions that humans can easily overcome....
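One of the synthetic corruptions from that paper, swapping adjacent internal characters, is simple to sketch (my paraphrase of the idea, not their code):

```python
import random

def swap_noise(word, rng):
    """'Swap' corruption: exchange one random pair of adjacent internal
    characters, keeping the first and last characters fixed."""
    if len(word) <= 3:
        return word  # no two internal neighbours to swap
    i = rng.randrange(1, len(word) - 2)
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]

def noisy_sentence(sentence, seed=0):
    rng = random.Random(seed)
    return " ".join(swap_noise(w, rng) for w in sentence.split())
```

Training on a mix of clean and swap-noised text is one of the robustness recipes the paper evaluates, alongside keyboard typos and fully scrambled words.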