
Why do you say it's not needed to run the models pruned by the nn_pruning tools?

Open luofuli opened this issue 3 years ago • 1 comment

In the README.md, why did you say that "it's not needed to run the models pruned by the nn_pruning tools"?

luofuli • Jun 08 '21 11:06

The nn_pruning tool removes entire attention heads and entire rows/columns of the feed-forward networks. The remaining heads are then fairly dense, and the feed-forward networks are completely dense after row/column removal. The resulting network is only slightly sparse, so there are not enough zeros for pytorch_block_sparse to compete with the highly optimized standard dense linear algebra kernels. Using standard PyTorch functions is simply faster.
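To illustrate the row/column removal point, here is a minimal pure-Python sketch, not nn_pruning's actual code. It assumes a toy feed-forward block without biases; the sizes, the `keep` list, and the helper names (`matvec`, `relu`, `ffn`) are all hypothetical. It shows that zeroing whole hidden units and physically removing them give the same output, and that the compacted matrices contain no zeros left for a sparse kernel to exploit.

```python
import random

def matvec(W, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def relu(v):
    return [max(0.0, a) for a in v]

def ffn(W1, W2, x):
    """Toy feed-forward block: W2 @ relu(W1 @ x), biases omitted."""
    return matvec(W2, relu(matvec(W1, x)))

random.seed(0)
d_model, d_ff = 4, 8  # hypothetical toy sizes
W1 = [[random.uniform(-1, 1) for _ in range(d_model)] for _ in range(d_ff)]
W2 = [[random.uniform(-1, 1) for _ in range(d_ff)] for _ in range(d_model)]
x = [random.uniform(-1, 1) for _ in range(d_model)]

# Hypothetical pruning decision: keep only these hidden units.
keep = [0, 2, 5]

# "Masked" view: pruned hidden units are zeroed, shapes are unchanged.
W1_masked = [row if i in keep else [0.0] * d_model for i, row in enumerate(W1)]
W2_masked = [[w if j in keep else 0.0 for j, w in enumerate(row)] for row in W2]

# "Compacted" view: the zero rows of W1 and zero columns of W2 are removed,
# leaving two smaller matrices that are fully dense.
W1_small = [W1[i] for i in keep]
W2_small = [[row[j] for j in keep] for row in W2]

y_masked = ffn(W1_masked, W2_masked, x)
y_small = ffn(W1_small, W2_small, x)

# Both views compute the same function of x.
same_output = all(abs(a - b) < 1e-12 for a, b in zip(y_masked, y_small))
# Count exact zeros remaining in the compacted weights.
zeros_left = sum(w == 0.0 for row in W1_small + W2_small for w in row)
```

In this sketch the compacted block is just a smaller dense matrix product, which is the case where ordinary dense kernels are already the natural choice.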

madlag • Oct 21 '21 10:10