
Feature: Support for Transformer Module

Open LLNLanLeN opened this issue 3 years ago • 3 comments

I've enjoyed using AIMET so far. With more and more recent research using Transformer modules as part of their CNN architectures, I'm wondering whether AIMET has plans to support quantization and export of this module in a way that is compatible with the Qualcomm SNPE API?

LLNLanLeN avatar Jan 19 '22 20:01 LLNLanLeN

@LLNLanLeN Thank you for your query and for being an active user of AIMET. Yes, we do plan to add support for quantization of transformers. Please do watch this space for more updates on this.

quic-ssiddego avatar Jan 22 '22 04:01 quic-ssiddego

@quic-ssiddego thank you for getting back to me. Does the AIMET team have an estimated timeline for when this feature will be released?

LLNLanLeN avatar Jan 24 '22 16:01 LLNLanLeN

@LLNLanLeN it's in the works at the moment. =) You could potentially give it a try with Hugging Face PyTorch models, starting with the BERT uncased model (https://github.com/huggingface/transformers/blob/master/src/transformers/models/bert/modeling_bert.py). Please note that the model will need some updates to follow these rules: PyTorch Model Guidelines — AI Model Efficiency Toolkit Documentation: ver tf-torch-cpu_1.16.2 (quic.github.io). Do give it a try and let me know.
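Roughly, a minimal sketch of that experiment could look like the following, assuming aimet_torch's post-training QuantizationSimModel flow (the single calibration sentence is just a placeholder for a real calibration set, and a stock Hugging Face BERT may still need the guideline updates above before it traces cleanly):

```python
import torch
from transformers import BertModel, BertTokenizer
from aimet_common.defs import QuantScheme
from aimet_torch.quantsim import QuantizationSimModel

# torchscript=True makes BERT return plain tuples, which traces more cleanly.
model = BertModel.from_pretrained("bert-base-uncased", torchscript=True).eval()
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Placeholder calibration input; use a representative dataset in practice.
encoded = tokenizer("a single calibration sentence", return_tensors="pt")
dummy_input = (encoded["input_ids"], encoded["attention_mask"])

# Simulate 8-bit quantization of both weights and activations.
sim = QuantizationSimModel(
    model,
    dummy_input=dummy_input,
    quant_scheme=QuantScheme.post_training_tf_enhanced,
    default_param_bw=8,
    default_output_bw=8,
)

# Compute quantization encodings from a (toy) calibration forward pass.
def forward_pass(model, _):
    with torch.no_grad():
        model(*dummy_input)

sim.compute_encodings(forward_pass_callback=forward_pass,
                      forward_pass_callback_args=None)

# Export the model plus encodings for downstream tools (e.g. SNPE conversion).
sim.export(path="./quantsim_out", filename_prefix="bert_uncased_w8a8",
           dummy_input=dummy_input)
```

The export step should produce an ONNX model plus an encodings file, which is what downstream conversion tools consume.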

quic-ssiddego avatar Jan 25 '22 03:01 quic-ssiddego

Is the Transformer module supported now?

@LLNLanLeN Were you able to make it run?

aayushahuja avatar Nov 21 '22 18:11 aayushahuja

@aayushahuja I didn't try out the BERT one exactly, but over the past few months I've been focusing on more mobile-friendly Transformer implementations. Unfortunately, I haven't had much luck running Transformers via Qualcomm SNPE on accelerated hardware (though they seem to work on CPU).

LLNLanLeN avatar Nov 22 '22 02:11 LLNLanLeN