Vishal Jain comments

Results: 15 comments by Vishal Jain

Build hardware for zc702

@saraballeri, the current version of CHaiDNN doesn't support the zc702 device, as it has fewer resources. We are going to make a release soon with support for Zynq702...

Build hardware for zc702

@saraballeri, unfortunately, there is no documentation on the CHaiDNN framework right now. But if you have specific questions, please post them and we'll try to address them in the...

Build hardware for zc702

Hi @tuanho27, for `sds_*` calls, you have to include the correct `sds_lib.h`. The same goes for linking the sds calls. In the Makefile, update the variables for aarch32. For example, ``` ARM_INC :=...

How to process one image at a time with the ChaiDNN v2?

@mazhongzhong, the CHaiDNN-v2 design uses a batch of 2. You can run the design with one frame (leave the other empty/uninitialized). But this way, you'll only utilize half of the...

Concat layer is not working.

@KFT-JunYS, This will be fixed with the next release (Mid-December) of CHaiDNN.

Incorrect/Garbage Responses for Llama-2-7b-hf with INT4 GPTQ/RTN Asymmetric Quantization

Any update on this? @yufenglee / @kunal-vaishnavi

Incorrect/Garbage Responses for Llama-2-7b-hf with INT4 GPTQ/RTN Asymmetric Quantization

Hey @yufenglee, I'm using the original llama2: [meta-llama/Llama-2-7b](https://huggingface.co/meta-llama/Llama-2-7b), exported to ONNX using the command below. ```powershell python -m onnxruntime.transformers.models.llama.convert_to_onnx -m meta-llama/Llama-2-7b-hf --output llama2-7b ``` Let me try quantizing it using the command line....

Incorrect/Garbage Responses for Llama-2-7b-hf with INT4 GPTQ/RTN Asymmetric Quantization

@yufenglee Just tried and printed the parsed args. The `--symmetric` flag isn't getting updated to `False`. ```powershell Namespace(input_model='llama2-7b-fp32/rank_0_Llama-2-7b-hf_decoder_merged_model_fp32_opt.onnx', output_model='bw_asym/model.onnx', block_size=32, symmetric=True, accuracy_level=0, verbose=True, nodes_to_exclude=[]) ``` I'll fix this locally and...
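One common way a boolean command-line flag can stay `True` no matter what is passed is declaring it with `type=bool` in `argparse`, because `bool("False")` is `True` for any non-empty string. The sketch below is only an illustration of that pitfall under the assumption that something similar is happening here; it is not taken from the quantizer's source, and `parse_buggy`, `parse_fixed`, and `str_to_bool` are hypothetical names.

```python
import argparse


def parse_buggy(argv):
    """Hypothetical parser showing the pitfall: type=bool on a string argument."""
    parser = argparse.ArgumentParser()
    # bool("False") is True, so any non-empty value yields True.
    parser.add_argument("--symmetric", type=bool, default=True)
    return parser.parse_args(argv)


def str_to_bool(value):
    """Convert common textual booleans explicitly instead of relying on bool()."""
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def parse_fixed(argv):
    """Hypothetical parser with an explicit string-to-bool converter."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--symmetric", type=str_to_bool, default=True)
    return parser.parse_args(argv)


if __name__ == "__main__":
    print(parse_buggy(["--symmetric", "False"]).symmetric)  # stays True
    print(parse_fixed(["--symmetric", "False"]).symmetric)  # becomes False
```

If this is the cause, printing the parsed `Namespace` before and after a local fix, as done above, is a quick way to confirm that `symmetric` actually changes.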

Incorrect/Garbage Responses for Llama-2-7b-hf with INT4 GPTQ/RTN Asymmetric Quantization

@yufenglee I get garbage outputs with Asymmetric. 📌 Running on Windows ### With Asymmetric Quantization (block size = 32, accuracy level 0) ``` - Prompt: ONNX Runtime is - Response:...

Incorrect/Garbage Responses for Llama-2-7b-hf with INT4 GPTQ/RTN Asymmetric Quantization

Interestingly, if I run the same model on **Linux (Ubuntu 18.04)**, I get somewhat better results with the Asymmetric model, but I still see non-English sentences/words within the responses. However, the responses...