Benjamin Fineran
tested against:
* CPU, GPU, FP32, FP16
* Zoo and local models
* base and layer-dropped models
supports loading a recipe and model from sparsezoo, applying that recipe to the model, and then optionally converting it to a quantized torch model to run on CPU. `torch.quantization.convert` has...
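a minimal sketch of that flow, assuming SparseML's `ScheduledModifierManager` API; the paths here are placeholders:

```python
import torch
from sparseml.pytorch.optim import ScheduledModifierManager

# hypothetical paths; the model and recipe could also be pulled from sparsezoo
model = torch.load("model.pth")

# apply the recipe's modifiers to the model at their end state
manager = ScheduledModifierManager.from_yaml("recipe.yaml")
manager.apply(model)

# convert the fake-quantized modules into true quantized ops for CPU inference
quantized_model = torch.quantization.convert(model.eval(), inplace=False)
```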
default saved epoch for `one_shot` in the IC (image classification) flows is `-1` due to `Trainer` initialization. This will cause issues on model load, since the checkpoint recipe will be initialized to...
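a hedged sketch of one possible guard on load; the function and the "end state" convention below are illustrative assumptions, not the implemented fix:

```python
def resolve_checkpoint_epoch(saved_epoch: int) -> float:
    # one_shot never steps the Trainer, so the saved epoch stays -1;
    # treating -1 as "before epoch 0" would re-initialize the checkpoint
    # recipe from scratch instead of at its applied end state
    if saved_epoch < 0:
        return float("inf")  # assumed convention for "recipe already applied"
    return float(saved_epoch)
```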
currently, in runs of composed staged recipes, modifier finalization only occurs at the end of the entire run (after all stages). This may cause issues because when a stage is...
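a minimal sketch of per-stage finalization, using hypothetical `Stage`/`Modifier` objects to illustrate the alternative:

```python
def run_staged_recipe(stages, model):
    for stage in stages:
        for modifier in stage.modifiers:
            modifier.initialize(model)
        stage.run(model)
        # finalize this stage's modifiers now, so later stages see
        # cleaned-up hooks/state instead of waiting for the entire run
        for modifier in stage.modifiers:
            modifier.finalize(model)
```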
in collaboration with @anmarques. The goal of this PR is to add a pass that emulates the INT32 quantization of an FC layer's bias add, to accurately match what happens during...
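a sketch of the emulation idea; variable names are illustrative, but the scale convention (bias scale = input scale * weight scale, zero point 0) is the standard one for INT32 bias quantization:

```python
import torch

def emulate_int32_bias(bias, input_scale, weight_scale):
    # quantized runtimes store the bias in INT32 with
    # bias_scale = input_scale * weight_scale and zero point 0
    bias_scale = input_scale * weight_scale
    bias_q = torch.clamp(
        torch.round(bias / bias_scale),
        torch.iinfo(torch.int32).min,
        torch.iinfo(torch.int32).max,
    )
    # dequantize so the FP32 graph observes the same rounding error
    # the quantized runtime would introduce
    return bias_q * bias_scale
```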
README for the `deepsparse.license` tool proposed in #630. @jeanniefinks and Rob G to complete the TODOs
a core feature of the QuantizationModifier refactor is the ability for users to have both simpler and more fine-grained control over how quantization is applied at large and...
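a purely illustrative sketch of that UX; the argument names below are assumptions about the refactored API, not its final form:

```python
# (import path omitted; QuantizationModifier is the class under refactor)

# simple: accept a default scheme applied model-wide
modifier = QuantizationModifier(start_epoch=0.0)

# fine grained: override schemes per submodule and exclude some modules
modifier = QuantizationModifier(
    start_epoch=0.0,
    scheme={"weights": {"num_bits": 8, "symmetric": True}},
    scheme_overrides={"classifier": {"weights": {"num_bits": 4}}},
    ignore=["input_embeddings"],
)
```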
first PR for the QuantizationModifier refactor. Moves the existing modifier and tests into a "legacy" file. Creates a template object for the new modifier. To maintain backwards compatibility, we add support...
scales and zero points were not accounting for the correct groups when iterating over the input channel dimension
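a minimal sketch of the corrected indexing; tensor names are illustrative. With grouping along the input channel dimension, channel `c` must read scale/zero-point index `c // group_size`:

```python
import torch

def dequantize_grouped(w_q, scales, zero_points, group_size):
    # w_q: (out_channels, in_channels) integer weights
    # scales, zero_points: (out_channels, in_channels // group_size)
    out_c, in_c = w_q.shape
    group_idx = torch.arange(in_c) // group_size  # (in_c,) group per channel
    s = scales[:, group_idx]                      # broadcast to (out_c, in_c)
    z = zero_points[:, group_idx]
    return (w_q.float() - z.float()) * s
```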