Huang Haiduo
> @haiduo Of course, the code isn't well organized, but it should run as-is. https://drive.google.com/drive/folders/1okWf2noIrnDdOBNYUfl_iZe3LLjb_59n?usp=share_link
>
> > @ysj9909 Very sorry about the late reply! The visualization code can be downloaded here: https://drive.google.com/drive/folders/1okWf2noIrnDdOBNYUfl_iZe3LLjb_59n?usp=share_link It is not well re-organized, but should work well....
You can comment out the two lines, 11 and 20: https://github.com/microsoft/LMOps/blob/daf972124f0699af18acee85473fece80fb405c2/minillm/tools/convert_mp.py#L11-L20
> > You can comment out the two lines, 11 and 20:
> > https://github.com/microsoft/LMOps/blob/daf972124f0699af18acee85473fece80fb405c2/minillm/tools/convert_mp.py#L11-L20
>
> Well, if I want to use a Qwen model, where do I find...
> @edwardjhu Can you please tell us why at least one of A or B has to be non-zero? Maybe the paper says that? Ensure that at the beginning...
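A minimal sketch of one way to answer this (my own reasoning, not a reply from the authors): with $\Delta W = BA$, the gradients are $\partial L/\partial B = G A^\top$ and $\partial L/\partial A = B^\top G$, where $G = \partial L/\partial(\Delta W)$. If both $A$ and $B$ start at zero, both gradients vanish and the adapter never leaves zero; the standard LoRA init (A random Gaussian, B zero) keeps $\Delta W = 0$ at the start while still giving B a non-zero gradient.

```python
# Sketch, not the authors' code: why LoRA cannot initialize BOTH A and B
# to zero. Gradients through Delta W = B @ A:
#   dL/dB = G @ A.T   and   dL/dA = B.T @ G,  with G = dL/d(Delta W).
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2
G = rng.normal(size=(d, d))          # some non-zero upstream gradient

# Case 1: both factors zero -> both gradients are zero, training is stuck.
A0, B0 = np.zeros((r, d)), np.zeros((d, r))
print(np.allclose(G @ A0.T, 0))      # True: B's gradient vanishes
print(np.allclose(B0.T @ G, 0))      # True: A's gradient vanishes

# Case 2: standard LoRA init: A random Gaussian, B zero.
A1, B1 = rng.normal(size=(r, d)), np.zeros((d, r))
print(np.allclose(B1 @ A1, 0))       # True: Delta W is still zero at init
print(np.allclose(G @ A1.T, 0))      # False: B receives a real gradient
```

So the zero factor guarantees the pretrained weights are untouched at step 0, while the non-zero factor guarantees the gradient signal is non-zero.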
My impression is that without BN recalibration, the searched subnets would actually achieve better accuracy. Perhaps the author uses BN recalibration precisely to filter out better candidates?
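For context, BN recalibration in one-shot NAS usually means re-estimating each BatchNorm layer's running mean/variance from the sampled subnet's own activations, since the supernet's running statistics mix activations from many different subnets. A minimal sketch (my assumptions, not the paper's code), using a plain exponential moving average over calibration batches:

```python
# Sketch of BN statistics recalibration for one sampled subnet.
# Assumption: we can feed a few calibration batches through the subnet and
# observe the pre-BN activations per layer; here we recalibrate one layer.
import numpy as np

def recalibrate_bn(activation_batches, momentum=0.1):
    """Re-estimate BN running mean/var from the subnet's own activations."""
    channels = activation_batches[0].shape[1]
    mean = np.zeros(channels)
    var = np.zeros(channels)
    for x in activation_batches:               # x: (batch, channels)
        mean = (1 - momentum) * mean + momentum * x.mean(axis=0)
        var = (1 - momentum) * var + momentum * x.var(axis=0)
    return mean, var

rng = np.random.default_rng(0)
# Fake calibration batches whose true per-channel mean is ~3.0
batches = [3.0 + rng.normal(size=(64, 4)) for _ in range(100)]
mean, var = recalibrate_bn(batches)
print(np.round(mean, 1))   # close to [3. 3. 3. 3.]
```

Whether skipping this step helps or hurts would then depend on how far the supernet's mixed statistics are from each subnet's true statistics.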
My feeling is that this is meant to push the whole supernet toward learning in the [290, 360] range, so that at search time the subnets around 320M generally have higher accuracy. Covering every subnet size at once seems very hard to learn, so can this be considered an experimental trick by the author?
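The trick described above can be sketched as biased subnet sampling during supernet training (this is my reading of the discussion, not the authors' code; the band, bounds, and probability below are made-up parameters):

```python
# Sketch: bias subnet-size sampling toward a target band so subnets near
# the band (e.g. ~320M params) are trained, and thus evaluated, more often.
import random

def sample_subnet_size(lo=200.0, hi=450.0, band=(290.0, 360.0), p_band=0.8):
    """With probability p_band, draw from the target band; else from the
    full range. Sizes are in millions of parameters (hypothetical units)."""
    if random.random() < p_band:
        return random.uniform(*band)
    return random.uniform(lo, hi)

random.seed(0)
sizes = [sample_subnet_size() for _ in range(10000)]
in_band = sum(290 <= s <= 360 for s in sizes) / len(sizes)
print(f"fraction in [290, 360]M: {in_band:.2f}")   # well above uniform's 0.28
```

Under uniform sampling only (360-290)/(450-200) ≈ 28% of draws would land in the band, so the bias concentrates training capacity where the search will look.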
> Thank you for the reply again!
>
> I hope this problem can be fixed, and I'll cross my fingers for your current work!

The implementation here seems to...
[`self.a_interval.append(0.16997124254703522/self.a_qmax)`](https://github.com/hahnyuan/PTQ4ViT/blob/b42c790c2811e7835d435a9f9a6c64a079a7483a/quant_layers/linear.py#L320) Hi @SuperVan-Young , why is the scale of the negative value area set to this constant? Is it the result of parameter tuning, or is it arbitrary?
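One plausible reading (my own guess, not confirmed by the repo authors): 0.16997124254703522 is the magnitude of GELU's global minimum, so the negative-region quantization interval exactly covers [min GELU(x), 0] rather than being a tuned or random value. A quick numerical check with the exact erf-based GELU:

```python
# Check that min_x GELU(x) is approximately -0.16997124254703522,
# matching the hard-coded negative-region scale in the linked line.
import math

def gelu(x):
    """Exact GELU: x * Phi(x) with the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

# Grid search for the minimum on the negative axis (step 1e-5 over [-3, 0)).
m = min(gelu(i * 1e-5) for i in range(-300000, 0))
print(m)   # approximately -0.169971...
```

If that reading is right, the constant is analytic: the negative outputs of GELU are bounded below by about -0.17, so one quantization interval is sized to that fixed range.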
> Hi @Ahmad-Jarrar , sorry for this. The quantization scheme proposed in the paper does not converge for low bit-widths, and some modification is necessary. I remember I posted this......