sparseml
Added RigL pruning modifier
Implements RigLPruningModifier, a modifier for RigL, the sparse training procedure proposed in https://arxiv.org/abs/1911.11134.
Description
RigL trains the model sparsely and periodically updates the sparsity mask. Each update prunes a certain fraction of the active weights with the smallest magnitude and regrows (unmasks) the same fraction of inactive weights according to the magnitude of their gradients, i.e. zero weights with large gradients become trainable again. The overall sparsity therefore stays constant across updates.
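A single prune-and-regrow update on one layer can be sketched as follows. This is a minimal NumPy illustration of the idea, not the code in this PR: the function name `rigl_update` and its arguments are hypothetical, regrowth candidates are assumed to be the weights that were masked before the update, and regrown weights are assumed to start from zero.

```python
import numpy as np


def rigl_update(weights, grads, mask, fraction):
    """Hypothetical helper: one prune-and-regrow step for a single layer.

    weights, grads: arrays of the same shape (dense gradient for all weights).
    mask: boolean array, True where a weight is currently trainable.
    fraction: share of the active weights to prune and then regrow.
    """
    mask = mask.astype(bool)
    n_active = int(mask.sum())
    # Never try to regrow more weights than are currently masked.
    n_update = min(int(fraction * n_active), mask.size - n_active)
    if n_update == 0:
        return mask.copy(), np.where(mask, weights, 0.0)

    flat_mask = mask.ravel().copy()

    # Prune: the active weights with the smallest magnitude.
    prune_scores = np.where(flat_mask, np.abs(weights.ravel()), np.inf)
    prune_idx = np.argsort(prune_scores, kind="stable")[:n_update]

    # Regrow: the previously masked weights with the largest gradient magnitude.
    grow_scores = np.where(flat_mask, -np.inf, np.abs(grads.ravel()))
    grow_idx = np.argsort(-grow_scores, kind="stable")[:n_update]

    flat_mask[prune_idx] = False
    flat_mask[grow_idx] = True
    new_mask = flat_mask.reshape(mask.shape)

    # Surviving weights keep their value; pruned and regrown weights are zero.
    new_weights = np.where(mask & new_mask, weights, 0.0)
    return new_mask, new_weights
```

Because the same number of weights is pruned and regrown, the number of active weights in the layer is unchanged by the update.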
This modifier implements the three sparsity distribution strategies proposed in the original paper:
- Uniform
- Erdos-Renyi
- Erdos-Renyi Kernel
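The three strategies differ in how a global sparsity target is split across layers. The sketch below follows the definitions in the RigL paper: Uniform gives every layer the same density, Erdos-Renyi scales the density of a layer by (n_in + n_out) / (n_in * n_out), and Erdos-Renyi Kernel also includes the kernel dimensions in that ratio. The function `layer_densities`, its arguments, and the handling of layers that would exceed density 1 (they are made dense and the budget is redistributed) are assumptions for illustration and may differ from what this modifier does.

```python
import numpy as np


def layer_densities(shapes, sparsity, strategy="erk"):
    """Hypothetical helper: per-layer density for a global sparsity target.

    shapes: weight shapes, e.g. (out, in) or (out, in, kh, kw).
    strategy: "uniform", "er" (Erdos-Renyi) or "erk" (Erdos-Renyi Kernel).
    """
    if not 0.0 < sparsity < 1.0:
        raise ValueError("sparsity must be strictly between 0 and 1")
    density = 1.0 - sparsity
    if strategy == "uniform":
        return [density] * len(shapes)
    if strategy not in ("er", "erk"):
        raise ValueError(f"unknown strategy: {strategy}")

    sizes = [int(np.prod(s)) for s in shapes]

    def raw_score(shape):
        # ER uses only the (out, in) dimensions; ERK uses all dimensions.
        dims = shape[:2] if strategy == "er" else shape
        return sum(dims) / float(np.prod(dims))

    scores = [raw_score(s) for s in shapes]
    dense = set()  # layers whose scaled density would exceed 1
    while True:
        # Nonzero budget left for the layers that are not kept dense.
        budget = density * sum(sizes) - sum(sizes[i] for i in dense)
        norm = sum(scores[i] * sizes[i]
                   for i in range(len(shapes)) if i not in dense)
        eps = budget / norm
        over = [i for i in range(len(shapes))
                if i not in dense and eps * scores[i] > 1.0]
        if not over:
            break
        dense.update(over)
    return [1.0 if i in dense else eps * scores[i]
            for i in range(len(shapes))]
```

With the Erdos-Renyi variants, small layers end up denser than large ones, while the total number of nonzero weights still matches the global target.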
Usage
This modifier requires a GradSampler (as does OBSPruningModifier) and collects a single gradient for score estimation.