Benjamin Fineran

30 comments by Benjamin Fineran

@mydhui you could try exporting a non-quantized FP32 model to see if the problematic Slice node is still there around this Conv. Additionally, you could skip this conv during...

Hi @mammadmaheri7, could you start from a clean environment? The import it's failing on is quite old (even before 1.5) and is not up to...

Hi @SunMarc, right now running compressed models is a WIP - we've prioritized a very flexible Q/DQ environment to enable a wide range of quantization settings and will likely roll out running...

Hi @SunMarc, I've updated to address your comments, specifically around the state dict load warnings and expanding the tests (note the second test case covers a sharded state dict)

Hi @tsamiss, DeepSparse does not support dynamic quantization. Static quantization model training and export are provided in [neuralmagic/sparseml](https://github.com/neuralmagic/sparseml)

Hi @yoloyash, we haven't looked into taking YOLO models to 4-bit, but I agree that this drop in accuracy is unexpected. You can try using our newer repo which...

Hi @hoangtv2000, how are you exporting the model? `convert_qat` should be set to True so that the Q/DQs will be folded into fully quantized layers
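To illustrate the folding that `convert_qat=True` requests at export time, plain PyTorch QAT shows the same idea: after training with fake-quant (Q/DQ) observers in place, `convert` folds them into real quantized layers. This is a generic torch sketch under assumed defaults, not sparseml's export path:

```python
import torch
import torch.ao.quantization as tq

class TinyQATModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()      # marks where FP32 -> int8 begins
        self.conv = torch.nn.Conv2d(1, 1, 1)
        self.dequant = tq.DeQuantStub()  # marks where int8 -> FP32 ends

    def forward(self, x):
        return self.dequant(self.conv(self.quant(x)))

model = TinyQATModel().train()
model.qconfig = tq.get_default_qat_qconfig("fbgemm")
tq.prepare_qat(model, inplace=True)      # insert fake-quant (Q/DQ) observers
model(torch.randn(1, 1, 4, 4))           # a "training" step to calibrate observers
model.eval()
quantized = tq.convert(model)            # fold Q/DQ into a quantized conv layer
```

Without the convert/fold step, the exported graph keeps standalone Q/DQ nodes around FP32 layers instead of fully quantized ones.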

Sounds like an interesting project! Yes, that approach would work - you could write the conversions on your own (or extend the ONNX transforms we have). Alternatively...

For a basic integration, take a look at the top level readme - the key point to get started is making sure that the recipe can wrap the model, dataset,...

Hi @anberto, we're currently in the middle of a large architecture change and will be updating our docs accordingly. In the meantime, looking at recipes in our examples may...