cccpr

Results: 67 comments by cccpr

@ruotianluo looks like the images in this [paper](https://www.cise.ufl.edu/~zizhao/mdnet.html)

@ruotianluo In case it is unclear: in the code below, I just visualize the "weight" tensor on the input image after resizing it from 1 * 49 to 7 *...

@ruotianluo
```
import numpy as np
import cv2
from scipy.misc import imread, imsave, imresize

def normalize_attention(cam, size):
    # cam = imresize(cam, (size, size))
    cam = cam - np.min(cam)
    cam_img = cam / np.max(cam)
    cam_img...
```
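
For completeness, here is a minimal, self-contained sketch of the same overlay idea using only cv2 (scipy.misc's imread/imresize are long deprecated); the function name, blending weights, and colormap below are my own placeholders, not the original code.

```
import numpy as np
import cv2

def overlay_attention(image_path, weights, out_path="attention_overlay.jpg"):
    """Overlay a 1 x 49 attention vector on an image as a heatmap (sketch)."""
    img = cv2.imread(image_path)                                # H x W x 3, BGR
    cam = np.asarray(weights, dtype=np.float32).reshape(7, 7)   # 1*49 -> 7*7 grid
    cam = cam - cam.min()                                       # min-max normalize to [0, 1]
    cam = cam / (cam.max() + 1e-8)
    cam = cv2.resize(cam, (img.shape[1], img.shape[0]))         # upsample to image size
    heat = cv2.applyColorMap(np.uint8(255 * cam), cv2.COLORMAP_JET)
    overlay = cv2.addWeighted(img, 0.5, heat, 0.5, 0)           # blend heatmap and image
    cv2.imwrite(out_path, overlay)
    return overlay
```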

@ruotianluo so, no technical mistakes or bugs?

I also have several doubts:
1) Are all layers (including the first and last layers) quantized in your experiments?
2) Is the activation also quantized in your experiments?
3) Why there is...

@MarceloGennari You can compare with: [1] Zhang D, Yang J, Ye D, et al. LQ-Nets: Learned quantization for highly accurate and compact deep neural networks. In: Proceedings of the European Conference...

@MarceloGennari
1. When will you release more experiment code (training code)?
2. To put it straight, can I say that the activations in your paper are **NOT quantized**? If it is...

@tlrmchlsmth Any plans for w4a8 quantization support?

@tonylins It seems that AWQ shifts the quantization error from the weights to the activations, since the activations remain fp16 in AWQ. Have you done any experiments with int4 or int8 activation...
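
To make the question concrete, this is a rough numpy sketch of the scale-transfer idea as I understand it: per-channel scales are multiplied into the weights before quantization and folded back into the (still floating-point) activations, so the product is unchanged in exact arithmetic. The names and the fake-quant helper are illustrative only, not AWQ's actual implementation.

```
import numpy as np

def fake_quant(w, n_bits=4):
    """Symmetric per-output-channel fake quantization (illustrative only)."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(w).max(axis=0, keepdims=True) / qmax
    return np.round(w / scale).clip(-qmax - 1, qmax) * scale

x = np.random.randn(8, 16).astype(np.float32)        # activations (kept in float)
w = np.random.randn(16, 32).astype(np.float32)       # weights to be quantized
s = 1.0 + np.random.rand(16, 1).astype(np.float32)   # per-input-channel scales

# AWQ-style transform: scale weights up, fold 1/s into the activations, so
# (x / s) @ quant(s * w) approximates x @ w with less error on the scaled channels.
y_awq = (x / s.T) @ fake_quant(w * s, n_bits=4)
y_ref = x @ w
print(np.abs(y_awq - y_ref).mean())
```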

@HandH1998 It seems that, if w4a8 is per-channel quantized without grouping (group_size=-1), the w8a8 Triton kernel [in the lmdeploy repo](https://github.com/InternLM/lmdeploy/blob/main/lmdeploy/pytorch/kernels/cuda/w8a8_triton_kernels.py) can easily be modified into a w4a8 one. Will the speedup...
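
As a rough illustration of why the change looks small (assuming the int4 packing layout below, which may differ from what lmdeploy actually uses), a w4a8 path mainly needs the packed int4 weights expanded to int8 before the existing int8 matmul; this numpy sketch only shows that unpack step, not the Triton kernel itself.

```
import numpy as np

def unpack_int4_to_int8(packed):
    """Unpack two signed int4 values per uint8 byte into int8.
    Assumed layout: low nibble = even column, high nibble = odd column."""
    low = (packed & 0x0F).astype(np.int8)
    high = ((packed >> 4) & 0x0F).astype(np.int8)
    # Sign-extend 4-bit two's complement values into int8.
    low = np.where(low >= 8, low - 16, low)
    high = np.where(high >= 8, high - 16, high)
    out = np.empty(packed.shape[:-1] + (packed.shape[-1] * 2,), dtype=np.int8)
    out[..., 0::2] = low
    out[..., 1::2] = high
    return out

# Example: 4 rows of 8 int4 weights packed into 4 bytes each.
packed = np.random.randint(0, 256, size=(4, 4), dtype=np.uint8)
w_int8 = unpack_int4_to_int8(packed)   # shape (4, 8), values in [-8, 7]
print(w_int8)
```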