deepsparse

Sparsity-aware deep learning inference runtime for CPUs

85 deepsparse issues

Tested against:
* CPU, GPU, FP32, FP16
* Zoo and local models
* base and layer-dropped models

Supports loading a recipe and model from SparseZoo, applying that recipe to the model, and then optionally converting it to a quantized torch model to run on CPU. `torch.quantization.convert` has...
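For context, below is a minimal sketch of the eager-mode post-training quantization flow that `torch.quantization.convert` belongs to. This is a generic PyTorch example, not this PR's code: the SparseZoo recipe loading is elided, and the toy model merely stands in for the sparsified one.

```python
import torch
import torch.nn as nn

# Toy float model standing in for the SparseZoo model.
model = nn.Sequential(
    torch.quantization.QuantStub(),
    nn.Linear(16, 16),
    nn.ReLU(),
    torch.quantization.DeQuantStub(),
)
model.eval()

# Attach a CPU quantization config and insert observers.
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
prepared = torch.quantization.prepare(model)

# Calibrate the observers with representative data.
with torch.no_grad():
    prepared(torch.randn(8, 16))

# Convert to a quantized torch model that runs on CPU.
quantized = torch.quantization.convert(prepared)
print(quantized)
```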

Adds a Lambda deployment to the examples directory. This is very similar to the SageMaker deployment. The scope of this application covers automating: 1. Construction of a...

Hello, I am keen to convert my quantized trained ONNX model into a blob file. OpenVINO, which I have been using so far, does not currently support this. Is...

enhancement

Note: not integrated into the server yet. The main hook is `start_file_watcher`, which the server calls to kick off a watcher process. Everything else is just helpers for that. The file...

mle-team
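As a rough illustration of the hook described above, here is a hedged sketch of what a `start_file_watcher` entry point could look like. Only the hook's name comes from the PR description; the polling logic, signature, and parameters are assumptions for illustration.

```python
import multiprocessing
import os
import time


def _watch(path: str, interval: float) -> None:
    # Hypothetical watcher body: poll the file's mtime and report
    # changes. The PR's actual watching logic is truncated above.
    last = os.path.getmtime(path) if os.path.exists(path) else None
    while True:
        time.sleep(interval)
        current = os.path.getmtime(path) if os.path.exists(path) else None
        if current != last:
            print(f"{path} changed")
            last = current


def start_file_watcher(path: str, interval: float = 1.0) -> multiprocessing.Process:
    # The server calls this hook to kick off the watcher process.
    proc = multiprocessing.Process(target=_watch, args=(path, interval), daemon=True)
    proc.start()
    return proc
```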

**Describe the bug**
As the title says, setting `-ncores` with `-s async` uses more cores than specified with `-ncores`. For example, with `deepsparse.benchmark oBERT-MobileBERT_14layer_50sparse_block4_qat.onnx -e onnxruntime -ncores 8 -s async`...

bug

**Is your feature request related to a problem? Please describe.**
Usage under Windows 10.

**Describe the solution you'd like**
Support for Windows 10.

enhancement

README for the `deepsparse.license` tool proposed in #630. @jeanniefinks and Rob G to complete the TODOs.

mle-team

This shows users how to use the `/deployment` directory of a model inside Docker. Test plan: run the example `docker build` command from the README.

mle-team
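For readers unfamiliar with the `/deployment` directory, the sketch below shows one way a model loaded from such a directory might be used with the DeepSparse `Pipeline` API, e.g. inside the container. The task, paths, and inputs are hypothetical examples, not taken from this PR.

```python
from deepsparse import Pipeline

# Assumed layout: ./deployment holds model.onnx plus its config and
# tokenizer files, mounted or copied into the image at build time.
pipeline = Pipeline.create(
    task="question_answering",   # hypothetical task choice
    model_path="./deployment",
)

output = pipeline(
    question="What is DeepSparse?",
    context="DeepSparse is a sparsity-aware inference runtime for CPUs.",
)
print(output)
```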