François Lagunas
The nn_pruning tool removes entire heads in attention layers and entire rows/columns in feed-forward networks. The remaining heads are then fairly dense, and the feed-forward networks are completely dense...
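A minimal sketch of the idea, using NumPy rather than nn_pruning itself: deleting row `i` of the first linear layer together with column `i` of the second one removes intermediate unit `i` entirely, leaving two smaller but fully dense matrices that compute the same function. All names and sizes here are illustrative.

```python
import numpy as np

# Toy feed-forward block: y = W2 @ relu(W1 @ x)
rng = np.random.default_rng(0)
d_model, d_ff = 8, 32
W1 = rng.normal(size=(d_ff, d_model))   # first linear layer
W2 = rng.normal(size=(d_model, d_ff))   # second linear layer
x = rng.normal(size=d_model)

# Suppose structured pruning decided to keep only these intermediate units.
keep = np.array([0, 3, 5, 9, 12, 20, 21, 30])

# Removing row i of W1 and column i of W2 deletes intermediate unit i.
W1_small = W1[keep, :]   # smaller, still dense first layer
W2_small = W2[:, keep]   # smaller, still dense second layer

def ffn(w1, w2, v):
    return w2 @ np.maximum(w1 @ v, 0.0)

# The shrunk network matches the original with the dropped units zeroed out.
mask = np.zeros(d_ff)
mask[keep] = 1.0
full = W2 @ (mask * np.maximum(W1 @ x, 0.0))
small = ffn(W1_small, W2_small, x)
print(np.allclose(full, small))  # → True
```

This is why structured pruning gives real speedups on standard hardware: the result is a plain dense model with smaller shapes, not a sparse matrix that needs special kernels.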
Hello @HamidShojanazeri, Not 100% sure, but this is probably because the GPU does not have enough computation to do at once to show a significant difference. Try increasing the batch size,...
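One quick way to see this effect is to measure per-sample time at several batch sizes: fixed launch and framework overhead is amortized over the batch, so per-sample cost usually drops as the batch grows. A rough CPU-side sketch with NumPy (a stand-in for the GPU case; the numbers are only indicative):

```python
import time
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(1024, 1024)).astype(np.float32)

def per_sample_ms(batch_size, repeats=20):
    """Average time per sample for a matmul at the given batch size."""
    x = rng.normal(size=(batch_size, 1024)).astype(np.float32)
    x @ W  # warm-up
    start = time.perf_counter()
    for _ in range(repeats):
        x @ W
    total = time.perf_counter() - start
    return 1000.0 * total / (repeats * batch_size)

for bs in (1, 8, 64):
    print(f"batch {bs:3d}: {per_sample_ms(bs):.4f} ms/sample")
```

On a GPU the same pattern is more pronounced, and it is why a pruned model may show little benefit at batch size 1: the device is idle most of the time either way.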
Yes, it's described in the upcoming EMNLP 2021 paper, also available on arXiv: https://arxiv.org/abs/2109.04838 .
Hello, That's cool. I have not tested on NER yet, so it will be interesting to check. Yes, I will expand the steps into a real example, so you...
Hi Jules! I pushed a new branch "madlag_fix", but I could not test it completely: I tried to run the experiment, but I lack the dataset files, and...
Hi! Yes, [#5](https://github.com/huggingface/nn_pruning/issues/5) contains useful stuff for you too. You will need to tune the sparse parameters to reach the sparsity you want. When everything goes well, some heads are pruned, and...
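To check whether heads actually got pruned, one simple diagnostic is to look for heads whose parameter rows are entirely zero. This is a hypothetical helper written against plain NumPy arrays, not nn_pruning's actual API; the layout (a `(num_heads * head_dim, d_model)` projection matrix) is an assumption.

```python
import numpy as np

def count_pruned_heads(weight, num_heads):
    """Count attention heads whose parameter rows are entirely zero.

    `weight` is assumed to be a (num_heads * head_dim, d_model) projection
    matrix; a head counts as pruned when its whole slice of rows is zero.
    """
    head_dim = weight.shape[0] // num_heads
    heads = weight.reshape(num_heads, head_dim, -1)
    return int(np.sum(np.all(heads == 0.0, axis=(1, 2))))

# Toy check: 4 heads of dimension 2 over a model dimension of 8,
# with heads 1 and 3 zeroed out as if pruned.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
W[2:4, :] = 0.0   # head 1
W[6:8, :] = 0.0   # head 3
print(count_pruned_heads(W, num_heads=4))  # → 2
```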