Add Mixout module
This is an issue moved over from the tf repo (here), and here is the pending PR I've sent there. According to the reviewer's suggestion, I should probably add the module here first.
Describe the feature and the current behavior/state. Mixout is a module proposed here. In short, it resembles dropout, but rather than setting the randomly selected weights to zero, it replaces them with the corresponding weights from the pre-trained model. By doing so, it helps improve stability in downstream fine-tuning tasks (a rough sketch of the computation is included under "Any other info" below).
Will this change the current api? How? Yes, it would require a new API such as tf.nn.mixout, with a signature similar to tf.nn.dropout.
Who will benefit from this feature? People who want to use BERT in downstream tasks with small datasets. This feature (as claimed in the paper) improves stability.
Any other info. A PyTorch version has been provided by the author.
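For concreteness, here is a rough sketch of what such an op could compute, based on my reading of the paper; the function name and signature are only illustrative, not an existing API:

```python
import tensorflow as tf

def mixout(weight, pretrained_weight, rate):
    # Illustrative sketch only: `mixout` is not an existing TF op; the names
    # mirror the tf.nn.dropout-like signature proposed above.
    # With probability `rate`, each entry of `weight` is swapped for the
    # corresponding entry of `pretrained_weight`; the result is then shifted
    # and rescaled so its expectation equals `weight` (cf. inverted dropout).
    mask = tf.cast(tf.random.uniform(tf.shape(weight)) < rate, weight.dtype)
    mixed = mask * pretrained_weight + (1.0 - mask) * weight
    return (mixed - rate * pretrained_weight) / (1.0 - rate)
```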
Relevant information

- Are you willing to contribute it: yes
- Are you willing to maintain it going forward? yes
- Is there a relevant academic paper? yes, here
- Is there already an implementation in another framework? there is a PyTorch version provided by the author, but I don't think it has been merged into the framework.
- Was it part of tf.contrib? (if so, where): no
Which API type would this fall under (layer, metric, optimizer, etc.)
custom_ops (since it's categorized under tensorflow/python/ops/nn_ops), yet I'm not sure which folder I should add it to (among activation/layer/image/seq2seq/text).
Sounds great! Feel free to file a PR. Also, our style is a little different from that of core TF, so please take a look at https://github.com/tensorflow/addons/blob/master/CONTRIBUTING.md. But no worries: if you run into any problems with the test suite or style, just open a PR first and ping me. Thank you.
BTW, I would say we can place it in layers but want to see other members' opinion @tensorflow/sig-addons-maintainers.
+1, it would be better if we create a new subclass for it, e.g. of Dropout.
Same thought. It would be much better if this were implemented as a layer.
Hey @AakashKumarNain @facaiy @WindQAQ, thanks for the suggestions! I'm working on the layer version but have realized it's actually hard, since Mixout requires manipulating the weights of the "previous" layer (these lines may make it clearer). I'm wondering whether there is any way to make it a layer wrapper or some sort of callback, so that it can access the weights via self.trainable etc.
Haven't taken a deeper look into the implementation, but we can subclass tf.keras.layers.Wrapper to access the weights of the wrapped layer.
https://www.tensorflow.org/api_docs/python/tf/keras/layers/Wrapper
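Something like the following (untested) skeleton could be a starting point; the class name and constructor are only illustrative, and the mixing itself is left as a TODO:

```python
import tensorflow as tf

# Untested skeleton: shows where the wrapped layer's weights become reachable
# via self.layer; the actual mixing logic is left as a comment.
class MixoutWrapper(tf.keras.layers.Wrapper):

    def __init__(self, layer, rate=0.1, **kwargs):
        super().__init__(layer, **kwargs)
        self.rate = rate

    def build(self, input_shape):
        # Wrapper.build builds the wrapped layer, so its variables exist now.
        super().build(input_shape)
        # Snapshot the pretrained values via self.layer (assumes the
        # pretrained weights were loaded into the layer before building).
        self.pretrained = [tf.identity(w) for w in self.layer.trainable_weights]

    def call(self, inputs, training=None):
        if training:
            # TODO: mix each weight in self.layer.trainable_weights with the
            # corresponding snapshot in self.pretrained, e.g. using a mixout
            # op like the one sketched earlier in this thread.
            pass
        return self.layer(inputs, training=training)
```

Usage would then look like MixoutWrapper(tf.keras.layers.Dense(768), rate=0.1), similar to existing wrappers such as tfa.layers.WeightNormalization.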
For the tfa.nn submodule, we had some discussion a long time ago: https://github.com/tensorflow/addons/issues/426. Not sure if it's worthwhile now.
Awesome, will look into that! Thanks a lot!
Hi @crystina-z, sorry to bother you, but is there any update regarding this? It would be of great help for my current project. Thank you in advance.
TensorFlow Addons is transitioning to a minimal maintenance and release mode. New features will not be added to this repository. For more information, please see our public messaging on this decision: TensorFlow Addons Wind Down
Please consider sending feature requests / contributions to other repositories in the TF community with similar charters to TFA: Keras, Keras-CV, Keras-NLP.