cccpr
@Tracin How do I build a TensorRT engine using the files created by ammo-w8-a8-smoothquant? I cannot find any docs.
@Tracin For the weight-only issue, you mentioned "make the build option aligned"; which option are you referring to?
@Tracin Many LLM-quantization papers (for example, [this paper](https://arxiv.org/pdf/2308.15987.pdf)) have stated that Llama2-7b-w8a8-smoothquant accuracy **is close to fp16 accuracy on MMLU** (I have also done some experiments in my own code, and the...
@Tracin You can check the comments I have already written in this issue; I have already used --per_channel --per_token.
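For reference, my full smoothquant flow follows the examples/llama README; a minimal sketch assuming the TensorRT-LLM 0.7.x scripts (paths and the alpha value are placeholders, and flag names may differ between releases):

```bash
# Export a SmoothQuant-calibrated checkpoint from the HF model
python3 hf_llama_convert.py -i /path/to/llama-2-7b-hf -o /path/to/sq-out \
    -sq 0.5 --tensor-parallelism 1 --storage-type fp16

# Build the engine with per-channel weight and per-token activation scaling
python3 build.py --bin_model_dir /path/to/sq-out/1-gpu \
    --use_smooth_quant --per_channel --per_token \
    --dtype float16 --output_dir /path/to/engines/llama-7b-sq
```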
@Tracin Thanks for the effort. You mentioned that the bad accuracy with int8-kv is not reproduced on your side. Can you share your TensorRT-LLM version and run commands?
@Tracin My TensorRT-LLM version is 0.7.1, and I followed the modifications you mentioned below, **but I still get 37.6 for w8a8 smoothquant accuracy on MMLU.** So there are some other...
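For completeness, the 37.6 number comes from the MMLU script shipped in the repo; a minimal sketch, assuming examples/mmlu.py and its flags are unchanged in 0.7.1 (paths are placeholders):

```bash
# Evaluate the built engine on MMLU (data_dir points at the downloaded MMLU csv files)
python3 mmlu.py --hf_model_dir /path/to/llama-2-7b-hf \
    --engine_dir /path/to/engines/llama-7b-sq \
    --data_dir /path/to/mmlu-data --test_trt_llm
```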
@already-taken-m17 @jcjohnson After training a model, I found the number of training epochs was not enough, so I reloaded it and tried to fine-tune (retrain), but why does the training loss look like the...
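One thing worth checking when the loss jumps back up after reloading: whether the optimizer state (Adam's moment estimates) and the learning-rate schedule are restored along with the weights. Reloading only the weights makes the first resumed updates behave like a cold start. A minimal PyTorch sketch of what I mean, with a hypothetical stand-in model:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the captioning model and its optimizer
model = nn.Linear(512, 100)
optimizer = torch.optim.Adam(model.parameters(), lr=4e-4)
epoch = 10

# Saving: persist the optimizer state and epoch alongside the weights
torch.save({
    "model": model.state_dict(),
    "optimizer": optimizer.state_dict(),  # Adam moments, step counts
    "epoch": epoch,
}, "checkpoint.pth")

# Resuming: restore all three, not just the model weights
ckpt = torch.load("checkpoint.pth")
model.load_state_dict(ckpt["model"])
optimizer.load_state_dict(ckpt["optimizer"])
start_epoch = ckpt["epoch"] + 1
```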
@ruotianluo Training on my own dataset. The attention looks like this: [image](https://user-images.githubusercontent.com/13804492/31314100-38608414-ac2a-11e7-8cdf-19874b596746.png) Any idea why the attention looks so weird?
@ruotianluo The caption results are fine. The dataset is quite small (fewer than 1000 images), and the vocabulary has fewer than 100 words.
@ruotianluo And after training for longer, the attention becomes like this: [image](https://user-images.githubusercontent.com/13804492/31314429-30ff442c-ac33-11e7-9da1-8613d7c43653.png)
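In case it helps with debugging, this is roughly how such overlays can be rendered; a minimal sketch assuming a 14x14 spatial attention map and a 224x224 input image (both arrays here are random placeholders):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import zoom

# Placeholder inputs: an RGB image and one word's spatial attention weights
image = np.random.rand(224, 224, 3)
attn = np.random.rand(14, 14)
attn /= attn.sum()  # weights sum to 1 over spatial locations

# Upsample the attention map to image resolution and overlay it
attn_up = zoom(attn, 224 / 14, order=1)
plt.imshow(image)
plt.imshow(attn_up, alpha=0.5, cmap="jet")
plt.axis("off")
plt.savefig("attention_overlay.png", bbox_inches="tight")
```

A near-uniform attention map rendered this way looks like a flat wash over the whole image, which is a common outcome when training on a very small dataset.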