Vishal Jain
@saraballeri, the current version of CHaiDNN doesn't support the zc702 device because it has fewer resources. We are going to make a release soon with the support for Zynq702...
@saraballeri, unfortunately, there is no documentation on the CHaiDNN framework right now. But if you have specific questions on something, please post them and we'll try to address them in the...
Hi @tuanho27, for `sds_*` calls, you have to include the correct `sds_lib.h`. The same goes for linking the sds calls. In the Makefile, update the variables for aarch32. For example, ``` ARM_INC :=...
@mazhongzhong, the CHaiDNN-v2 design uses a batch of 2. You can run the design with one frame (leave the other empty/uninitialized), but this way you'll only utilize half of the...
@KFT-JunYS, This will be fixed with the next release (Mid-December) of CHaiDNN.
Any update on this? @yufenglee / @kunal-vaishnavi
Hey @yufenglee, I'm using the original Llama 2: [meta-llama/Llama-2-7b](https://huggingface.co/meta-llama/Llama-2-7b), exported to ONNX using the command below. ```powershell python -m onnxruntime.transformers.models.llama.convert_to_onnx -m meta-llama/Llama-2-7b-hf --output llama2-7b ``` Let me try quantizing it using command line....
@yufenglee Just tried and printed the parsed args. The `--symmetric` flag isn't getting updated to `False`. ```powershell Namespace(input_model='llama2-7b-fp32/rank_0_Llama-2-7b-hf_decoder_merged_model_fp32_opt.onnx', output_model='bw_asym/model.onnx', block_size=32, symmetric=True, accuracy_level=0, verbose=True, nodes_to_exclude=[]) ``` I'll fix this locally and...
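For anyone hitting the same symptom: one common way a boolean command-line flag ends up stuck at `True` is declaring it with `type=bool` in `argparse`, because `bool("False")` is `True` for any non-empty string. Whether that is the cause in the quantizer script is an assumption on my part, not something confirmed in this thread. A minimal sketch of the pitfall and one possible workaround (the `str2bool` helper and both parser builders are hypothetical names used only for illustration):

```python
import argparse


def str2bool(value: str) -> bool:
    """Parse common textual booleans; argparse's type=bool does not do this."""
    lowered = value.strip().lower()
    if lowered in {"1", "true", "yes", "y"}:
        return True
    if lowered in {"0", "false", "no", "n"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_buggy_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Pitfall: bool("False") is True because any non-empty string is truthy.
    parser.add_argument("--symmetric", type=bool, default=True)
    return parser


def build_fixed_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Workaround: convert the text explicitly instead of relying on bool().
    parser.add_argument("--symmetric", type=str2bool, default=True)
    return parser


buggy = build_buggy_parser().parse_args(["--symmetric", "False"])
fixed = build_fixed_parser().parse_args(["--symmetric", "False"])
print(buggy.symmetric, fixed.symmetric)  # prints: True False
```

Printing the parsed `Namespace`, as done above, is a quick way to check which of the two behaviours a given script has.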
@yufenglee I get garbage outputs with Asymmetric. 📌 Running on Windows ### With Asymmetric Quantization (block size = 32, accuracy level 0) ``` - Prompt: ONNX Runtime is - Response:...
Interestingly, if I run the same model on **Linux (Ubuntu 18.04)**, I get somewhat better results with the asymmetric model, but I still see non-English sentences/words within the responses. However, the responses...
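For context on what the symmetric/asymmetric switch changes, here is a rough, generic sketch of block-wise 4-bit quantization with a block size of 32. It is only an illustration of the general technique, not ONNX Runtime's actual kernel or storage layout, and every function name in it is hypothetical:

```python
from typing import List, Tuple

Block = Tuple[List[int], float, int]  # (quantized values, scale, zero point)


def quantize_block_asymmetric(block: List[float], bits: int = 4) -> Block:
    """Asymmetric: the scale and the zero point both come from the block's range."""
    qmax = (1 << bits) - 1  # 15 for 4 bits
    lo = min(min(block), 0.0)  # widen the range so that 0.0 is representable
    hi = max(max(block), 0.0)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = min(qmax, max(0, round(-lo / scale)))
    q = [min(qmax, max(0, round(x / scale) + zero_point)) for x in block]
    return q, scale, zero_point


def quantize_block_symmetric(block: List[float], bits: int = 4) -> Block:
    """Symmetric: the zero point is fixed at mid-range (8 for 4 bits)."""
    qmax = (1 << bits) - 1
    zero_point = 1 << (bits - 1)
    max_abs = max(abs(x) for x in block)
    scale = max_abs / zero_point if max_abs > 0 else 1.0
    q = [min(qmax, max(0, round(x / scale) + zero_point)) for x in block]
    return q, scale, zero_point


def dequantize_block(q: List[int], scale: float, zero_point: int) -> List[float]:
    """Recover approximate floats from one quantized block."""
    return [(v - zero_point) * scale for v in q]


def quantize_blockwise(weights: List[float], block_size: int = 32,
                       symmetric: bool = False) -> List[Block]:
    """Quantize a flat weight list one block at a time."""
    fn = quantize_block_symmetric if symmetric else quantize_block_asymmetric
    return [fn(weights[i:i + block_size])
            for i in range(0, len(weights), block_size)]
```

Since the asymmetric path stores a per-block zero point that the dequantization step has to apply, comparing dequantized weights against the fp32 originals for a few blocks may help narrow down whether the garbage output comes from quantization or from the runtime kernel.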