Matt Watson
@howl-anderson are you currently working on this?
Definitely, it would be great to start adding explainability tools here! I was actually just wishing we had precisely this while working on a guide for keras-nlp. Would you be...
I think the issue here is with the residual. The inputs are summed with the MHA output, and that output is summed with the FF output. That effectively enforces the constraint...
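Schematically, something like this (a minimal sketch assuming a standard Keras encoder-style block; the layer sizes and names are illustrative, not the actual code under discussion):

```python
from tensorflow import keras
from tensorflow.keras import layers


def encoder_block(inputs, num_heads=2, key_dim=32, ff_dim=64):
    """Illustrative encoder-style block showing the two residual sums."""
    # The inputs are summed with the multi-head attention output...
    attn_out = layers.MultiHeadAttention(num_heads=num_heads, key_dim=key_dim)(
        inputs, inputs
    )
    x = layers.LayerNormalization()(layers.Add()([inputs, attn_out]))
    # ...and that result is summed with the feed-forward output, which is why
    # the last Dense must project back to the input width.
    ff = layers.Dense(ff_dim, activation="relu")(x)
    ff = layers.Dense(inputs.shape[-1])(ff)
    return layers.LayerNormalization()(layers.Add()([x, ff]))


inputs = keras.Input(shape=(None, 64))
model = keras.Model(inputs, encoder_block(inputs))
```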
Assigning myself as a placeholder; I believe we may already have some people lined up to work on this.
No strong preference from me for the model. This guide will be more about tokenizers and `greedy_search` than the model itself. LSTM could be nice and simple, but if you...
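For reference, a rough sketch of the kind of greedy decoding loop such a guide could walk through; `model`, `prompt_ids`, and `end_token_id` here are placeholders rather than the actual guide code:

```python
import tensorflow as tf


def greedy_decode(model, prompt_ids, end_token_id, max_length=64):
    """Pick the argmax token at every step until the end token or max_length."""
    ids = list(prompt_ids)
    while len(ids) < max_length:
        logits = model(tf.constant([ids]))       # assumed shape: (1, seq_len, vocab_size)
        next_id = int(tf.argmax(logits[0, -1]))  # highest-probability next token
        ids.append(next_id)
        if next_id == end_token_id:
            break
    return ids
```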
Development instructions for contributing examples are [here](https://github.com/keras-team/keras-io#adding-a-new-code-example). Examples should be short
Thank you! Please tag me for review!
@chenmoneygithub I know you are already looking at this; just opening an issue so I can reference it elsewhere!
Can you explain more about how this will work? Will this be a layer that randomly deletes stopwords, or one that drops them deterministically?
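For concreteness, a toy sketch of the two behaviors I have in mind; the function name, stopword list, and `rate` parameter are all made up for illustration:

```python
import random

STOPWORDS = {"a", "an", "the", "is", "of", "and"}  # placeholder list


def delete_stopwords(text, rate=1.0, seed=None):
    """Drop each stopword with probability `rate` (rate=1.0 is deterministic)."""
    rng = random.Random(seed)
    kept = [
        word
        for word in text.split()
        if word.lower() not in STOPWORDS or rng.random() >= rate
    ]
    return " ".join(kept)


print(delete_stopwords("the cat sat on a mat", rate=0.5, seed=42))  # random deletion
print(delete_stopwords("the cat sat on a mat", rate=1.0))           # drops every stopword
```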
Is this issue specifically for the BERT example?