Feature: Support for Transformer Module
I've enjoyed using AIMET so far. With more and more recent research using the Transformer module as part of their CNN architectures, I'm wondering whether AIMET has plans to support quantization and export of this module in a way that is compatible with the Qualcomm SNPE API?
@LLNLanLeN Thank you for your query and for being an active user of AIMET. Yes, we do plan to add support for quantization of transformers. Please do watch this space for more updates on this.
@quic-ssiddego thank you for getting back to me. Does the AIMET team have an estimated timeline for when this feature will be released?
@LLNLanLeN it's in the works at the moment. =). You could potentially give it a try with Hugging Face PyTorch models, starting with the BERT uncased model (https://github.com/huggingface/transformers/blob/master/src/transformers/models/bert/modeling_bert.py). Please note that the model will need some updates to follow these rules: PyTorch Model Guidelines — AI Model Efficiency Toolkit Documentation, ver. tf-torch-cpu_1.16.2 (quic.github.io). Do give it a try and let me know.
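For context on what those model updates look like: AIMET's PyTorch model guidelines ask that each operation be a distinct `torch.nn.Module` instance defined in `__init__` (rather than a functional call, or one module instance reused in `forward`), so that per-op quantization wrappers can be attached. A minimal sketch, assuming only plain PyTorch — the class and names here are illustrative, not part of AIMET or the BERT code:

```python
import torch
import torch.nn as nn

class GuidelineFriendlyBlock(nn.Module):
    """Toy block following the AIMET guideline: one nn.Module per op, no reuse."""

    def __init__(self, dim: int = 8):
        super().__init__()
        self.linear1 = nn.Linear(dim, dim)
        # Separate ReLU instances instead of reusing one (or calling F.relu),
        # so each activation can get its own quantizer when modules are wrapped.
        self.relu1 = nn.ReLU()
        self.linear2 = nn.Linear(dim, dim)
        self.relu2 = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.relu1(self.linear1(x))
        return self.relu2(self.linear2(x))

block = GuidelineFriendlyBlock()
out = block(torch.randn(2, 8))
print(tuple(out.shape))
```

A model structured this way should be easier to feed into AIMET's quantization simulation and export flow; the HuggingFace BERT code uses functional calls and reused modules in places, which is why the guidelines mention it needs updates.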
Is the transformer module supported now?
@LLNLanLeN Were you able to make it run?
@aayushahuja I didn't try out the BERT one exactly, but over the past few months I've been focusing on more mobile-friendly Transformer implementations, and unfortunately I haven't had much luck running Transformers via Qualcomm SNPE on accelerated hardware (though they do seem to work on CPU).