
speed

Open aimen123 opened this issue 2 years ago • 6 comments

The size of the pruned and non-pruned models is the same (14.4M), and the speed is the same, too. What do I need to do to make it go faster on my computer?

aimen123 avatar Apr 29 '22 03:04 aimen123

Do you have code to simplify the model?

aimen123 avatar Apr 29 '22 04:04 aimen123

Hey @aimen123. Could you provide more details to help us identify the problem?

Which model are you referring to? Could you give us the name or a link? Is it one of our models from SparseZoo (https://sparsezoo.neuralmagic.com/)? How are you running inference (are you running on GPU, CPU, or on CPU using our DeepSparse Engine?)

dbogunowicz avatar Apr 29 '22 05:04 dbogunowicz

@dbogunowicz Is the code doing Quantization-Aware Training? Why only 2 epochs in yolov5s.pruned_quantized.md?

aimen123 avatar Apr 29 '22 07:04 aimen123

It would really help us if you could adhere to the bug reporting standards described here: https://github.com/neuralmagic/sparseml/blob/main/CONTRIBUTING.md

For bugs, include:

brief summary
OS/Environment details
steps to reproduce (s.t.r.)
code snippets, screenshots/casts, log content, sample models
add the GitHub label "bug" to your post

This way we should be able to troubleshoot more efficiently.

Thank you!

dbogunowicz avatar Apr 29 '22 07:04 dbogunowicz

Hi @aimen123, the PyTorch weight files and ONNX export files will remain the same size for pruning only. This is because the pruning is done in an unstructured way, meaning we introduce zeros into the weight matrices rather than changing their shape. To realize the file-size benefit of unstructured pruning, you'll need to run a compression algorithm over the weights, which will produce smaller files.

For example, try the following on a Linux machine: tar -czvf model.onnx.tar.gz model.onnx
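To illustrate the point above, here is a minimal standalone sketch (not SparseML code; the array sizes and sparsity level are made up for illustration) showing that zeroing out weights leaves the serialized size unchanged, while a generic compressor like gzip shrinks the sparse version much more:

```python
import gzip
import random
import struct

random.seed(0)

# Hypothetical "weight tensor": 10,000 float32 values.
dense = [random.uniform(-1.0, 1.0) for _ in range(10_000)]

# Unstructured pruning zeroes ~90% of the values but keeps the
# same tensor shape, so the serialized file is the same size.
pruned = [0.0 if random.random() < 0.9 else w for w in dense]

dense_bytes = struct.pack(f"{len(dense)}f", *dense)
pruned_bytes = struct.pack(f"{len(pruned)}f", *pruned)

# Same on-disk size before compression.
assert len(dense_bytes) == len(pruned_bytes)

# A compressor (like gzip, or tar -czvf) exploits the runs of
# zeros, so the pruned weights compress to a much smaller file.
print(len(gzip.compress(dense_bytes)), len(gzip.compress(pruned_bytes)))
```

The same reasoning explains the speed observation: a runtime that executes the model as dense math still multiplies by all those zeros, so unstructured pruning alone does not make inference faster there.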

markurtz avatar May 02 '22 16:05 markurtz


ok

aimen123 avatar May 13 '22 08:05 aimen123

Closing out this older issue as there has been no further activity. Regards, Jeannie / Neural Magic

jeanniefinks avatar Feb 01 '23 19:02 jeanniefinks