Benjamin Fineran
@mydhui you could try exporting a non-quantized FP32 model to see if the problematic Slice node is still there around this conv. Additionally, you could skip this conv during...
Hi @mammadmaheri7 can you maybe start from a clean environment? It looks like the import it's failing on is quite old (predating 1.5) and is not up to...
Hi @SunMarc right now running compressed models is a WIP - we've prioritized a very flexible Q/DQ environment to enable a wide range of quantization settings and will likely roll out running...
Hi @SunMarc I've updated to address your comments, specifically around the state dict load warnings and expanding the tests (note the second test case covers a sharded state dict).
Hi @tsamiss deepsparse does not support dynamic quantization. Static quantization model training and export are provided in [neuralmagic/sparseml](https://github.com/neuralmagic/sparseml).
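The static/dynamic distinction above can be sketched in plain Python (this is illustrative only, not the deepsparse or sparseml API): in static quantization the scale and zero-point are derived once from calibration data and baked into the exported model, whereas dynamic quantization would recompute them from each input tensor at runtime.

```python
def quant_params(values, num_bits=8):
    """Derive an asymmetric int8 scale/zero-point from an observed value range."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = round(qmin - lo / scale)
    return scale, zero_point

def quantize(values, scale, zero_point):
    """Map FP32 values onto the uint8 grid with fixed quantization parameters."""
    return [max(0, min(255, round(v / scale) + zero_point)) for v in values]

# Static quantization: parameters come from a calibration pass and are fixed.
calibration = [-1.0, 0.0, 2.0, 3.0]
scale, zp = quant_params(calibration)

# Every future input reuses the same fixed parameters at inference time.
q = quantize([0.5, 1.5], scale, zp)

# Dynamic quantization would instead call quant_params() on each incoming
# tensor at runtime - that per-input step is what is not supported here.
```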
Hi @yoloyash we haven't looked into taking YOLO models to 4-bit but I do agree that this drop in accuracy is unexpected. You can try using our newer repo which...
Hi @hoangtv2000 how are you exporting the model? `convert_qat` should be set to True so that the Q/DQs will be folded into fully quantized layers.
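The folding that `convert_qat=True` performs can be sketched conceptually (this is not the exporter's actual code): a Q/DQ pair left in the graph quantizes a value and immediately dequantizes it back to FP32, while the folded form stores the int8 value in the layer itself and applies the scale at execution time. Both produce the same numerics.

```python
def fake_quant(w, scale):
    """Q/DQ pair left in the graph: quantize, then immediately dequantize."""
    return round(w / scale) * scale

def folded_int8(w, scale):
    """Folded form: the int8 weight is baked into the quantized layer."""
    return round(w / scale)

w, scale = 0.37, 0.01
# The folded int8 weight times its scale matches the fake-quantized value,
# so folding changes the graph structure but not the result.
assert abs(fake_quant(w, scale) - folded_int8(w, scale) * scale) < 1e-9
```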
Sounds like an interesting project! Yes, that is an approach that would work - you could write the conversions on your own (or extend the ONNX transforms we have). Alternatively...
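A graph conversion of the kind those transforms implement amounts to pattern matching over nodes and rewriting the match. Here is a toy version with plain dicts standing in for ONNX nodes (the node structure and op handling are illustrative, not the real transform API): it collapses a QuantizeLinear -> DequantizeLinear pair feeding a Conv into a single quantized Conv.

```python
def fold_qdq_conv(nodes):
    """Rewrite QuantizeLinear -> DequantizeLinear -> Conv into one QLinearConv."""
    out, i = [], 0
    while i < len(nodes):
        window = nodes[i:i + 3]
        if [n["op"] for n in window] == ["QuantizeLinear", "DequantizeLinear", "Conv"]:
            # Collapse the three-node pattern into a single quantized node.
            out.append({"op": "QLinearConv", "inputs": window[0]["inputs"]})
            i += 3
        else:
            out.append(nodes[i])
            i += 1
    return out

graph = [
    {"op": "QuantizeLinear", "inputs": ["x"]},
    {"op": "DequantizeLinear", "inputs": ["x_q"]},
    {"op": "Conv", "inputs": ["x_dq", "w"]},
    {"op": "Relu", "inputs": ["y"]},
]
```

Writing against the real ONNX protobuf works the same way, just matching on `node.op_type` and rewiring input/output names instead of list positions.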
For a basic integration, take a look at the top-level README - the key point to get started is making sure that the recipe can wrap the model, dataset,...
Hi @anberto we're currently in the middle of a large architecture change and will be updating our docs with it. In the meantime, looking at recipes in our examples may...