
Huggingface base Wav2Vec2 model crashing

Open Tim-blo opened this issue 3 years ago • 5 comments

Describe the bug

Hello,

I am trying to compile the onnx-converted model of a sparse Huggingface base Wav2Vec2 model (where sparsity was obtained via unstructured magnitude pruning) through compile_model :

dse_network = compile_model(onnx_filepath, batch_size=batch_size, num_cores=1, num_streams=1)
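(For context on the model being compiled: unstructured magnitude pruning, as mentioned above, zeroes out the smallest-magnitude weights globally in each tensor until a target sparsity is reached. A minimal NumPy sketch of the idea — not the actual SparseML implementation:)

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude entries until `sparsity` of them are zero."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of entries to zero
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

w = np.array([[0.9, -0.05, 0.3], [-0.01, 0.7, 0.02]])
pw = magnitude_prune(w, 0.5)  # half of the six entries are zeroed
```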

My kernel crashed and I received the following message:

Backtrace:
 0# wand::detail::abort_prefix(std::ostream&, char const*, char const*, int, bool, bool, unsigned long) in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/libonnxruntime.so.1.10.0
 1# 0x00007FFB125A27C4 in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/libonnxruntime.so.1.10.0
 2# 0x00007FFB125A8906 in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/libonnxruntime.so.1.10.0
 3# 0x00007FFB125A89F2 in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/libonnxruntime.so.1.10.0
 4# 0x00007FFB125B12FA in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/libonnxruntime.so.1.10.0
 5# 0x00007FFB125B1370 in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/libonnxruntime.so.1.10.0
 6# 0x00007FFB11B1F76D in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/libonnxruntime.so.1.10.0
 7# 0x00007FFB11B25BCF in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/libonnxruntime.so.1.10.0
 8# 0x00007FFB11A92015 in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/libonnxruntime.so.1.10.0
 9# 0x00007FFB11A81939 in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/libonnxruntime.so.1.10.0
10# 0x00007FFB11A82AF1 in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/libonnxruntime.so.1.10.0
11# 0x00007FFB1213F938 in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/libonnxruntime.so.1.10.0
12# 0x00007FFB121423B3 in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/libonnxruntime.so.1.10.0
13# 0x00007FFB121456B9 in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/libonnxruntime.so.1.10.0
14# 0x00007FFB11A6312B in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/libonnxruntime.so.1.10.0
15# 0x00007FFB11A6B3CE in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/libonnxruntime.so.1.10.0
16# 0x00007FFB11A11C1A in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/libonnxruntime.so.1.10.0
17# 0x00007FFB11A11ED5 in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/libonnxruntime.so.1.10.0
18# deepsparse::ort_engine::init(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int, std::shared_ptr<wand::parallel::scheduler_factory_t>) in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/libdeepsparse.so
19# 0x00007FFBE3641649 in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/deepsparse_engine.so
20# 0x00007FFBE364184B in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/deepsparse_engine.so
21# 0x00007FFBE36788B6 in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/deepsparse_engine.so
22# 0x00007FFBE364B0F9 in /opt/conda/lib/python3.9/site-packages/deepsparse/avx512/deepsparse_engine.so
23# 0x0000561F0FD79B66 in /opt/conda/bin/python

Please email a copy of this stack trace and any additional information to: [email protected]

Environment

Include all relevant environment information:

  1. OS: Ubuntu 18.04.5 LTS
  2. Python version: 3.9.4
  3. DeepSparse version or commit hash: 1.0.2
  4. ML framework version(s): torch 1.11.0
  5. Other Python package versions: onnxruntime 1.12.0, onnx 1.12.0
  6. CPU info - output of deepsparse/src/deepsparse/arch.bin or output of cpu_architecture() as follows:
>>> import deepsparse.cpu
>>> print(deepsparse.cpu.cpu_architecture())

{'L1_data_cache_size': 32768, 'L1_instruction_cache_size': 32768, 'L2_cache_size': 1048576, 'L3_cache_size': 31719424, 'architecture': 'x86_64', 'available_cores_per_socket': 19, 'available_num_cores': 38, 'available_num_hw_threads': 76, 'available_num_numa': 2, 'available_num_sockets': 2, 'available_sockets': 2, 'available_threads_per_core': 2, 'cores_per_socket': 19, 'isa': 'avx512', 'num_cores': 38, 'num_hw_threads': 76, 'num_numa': 2, 'num_sockets': 2, 'threads_per_core': 2, 'vendor': 'GenuineIntel', 'vendor_id': 'Intel', 'vendor_model': 'Intel(R) Xeon(R) Gold 6161 CPU @ 2.20GHz', 'vnni': False}
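(The most diagnostic fields in the dict above are the ISA and VNNI flags: DeepSparse selects its binary by ISA — note the avx512 path in the backtrace — and VNNI determines INT8 acceleration. A small sketch of interpreting them, using a subset of the fields printed above:)

```python
# Subset of the cpu_architecture() output reproduced from above
arch = {'isa': 'avx512', 'vnni': False, 'num_cores': 38, 'num_sockets': 2}

# The engine loaded from the deepsparse/avx512/ directory, matching this ISA
assert arch['isa'] == 'avx512'

# No VNNI on Xeon Gold 6161, so no INT8 instruction acceleration
supports_int8_acceleration = arch['vnni']

cores_per_socket = arch['num_cores'] // arch['num_sockets']
```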

Would you happen to have a solution? Thank you.

Tim-blo avatar Aug 03 '22 14:08 Tim-blo

Hi @Tim-blo, thank you for reporting your issue.

I can take a look. To help me reproduce the problem, could you share your onnx file?

tlrmchlsmth avatar Aug 03 '22 14:08 tlrmchlsmth

Hi @tlrmchlsmth,

Thanks for the quick reply. Here is a link to my onnx file: https://platform.oroson.co/shi/2TSpOC2KBy9wfFO1GV7gRL

Tim-blo avatar Aug 03 '22 15:08 Tim-blo

@Tim-blo thank you for the model! I am taking a look now.

tlrmchlsmth avatar Aug 03 '22 17:08 tlrmchlsmth

Hi @Tim-blo, I have a fix for the issue you are running into, so this will be resolved in the next nightly that goes out, and in 1.1.0.

tlrmchlsmth avatar Aug 05 '22 21:08 tlrmchlsmth

Thanks a lot!

Tim-blo avatar Aug 08 '22 07:08 Tim-blo

Hi @Tim-blo, 1.1.0 is out! We tested that your model shared above works on the nightly and release. Let us know how it works for you, thanks.

mgoin avatar Aug 26 '22 16:08 mgoin

Hey @Tim-blo, have you had a chance to try the engine since then? We're looking at Wav2Vec models as a whole and would be interested in hearing about your process for producing this model, thanks!

mgoin avatar Sep 09 '22 21:09 mgoin

Hi @mgoin, I work with @Tim-blo in the same project.

When we tested it, we got a message that there is no implementation of the GloupConv1D layer (used by Wav2Vec2PositionalConvEmbedding). After removing this layer from the original model, the inference engine works correctly (with faster predictions). Do you have any news on whether the GloupConv1D layer is implemented yet? A second question: do you know how we could use dynamic padding in DeepSparse?

Thank you
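(On the dynamic-padding question above: compiled engines typically expect a fixed input shape, so a common workaround is to pad each batch of variable-length audio to a fixed or bucketed length on the host side before inference. A minimal NumPy sketch of such host-side padding — a hypothetical helper, not a DeepSparse API:)

```python
import numpy as np

def pad_batch(sequences, pad_value=0.0):
    """Right-pad 1-D audio arrays to the longest sequence in the batch.

    Returns the padded batch plus a boolean mask marking real samples,
    so padded positions can be ignored downstream.
    """
    max_len = max(len(s) for s in sequences)
    batch = np.full((len(sequences), max_len), pad_value, dtype=np.float32)
    mask = np.zeros((len(sequences), max_len), dtype=bool)
    for i, s in enumerate(sequences):
        batch[i, : len(s)] = s
        mask[i, : len(s)] = True
    return batch, mask

audio = [np.ones(3, dtype=np.float32), np.ones(5, dtype=np.float32)]
batch, mask = pad_batch(audio)  # both sequences padded to length 5
```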

msobroza avatar Dec 02 '22 16:12 msobroza