ThisisBillhe
> Not supported at the moment. Is this a CNN? You could use an equivalent formulation with a 1x1 conv; note that the number of output channels must be a multiple of 32.

It is a linear layer. I mainly want to understand whether binarizing transformers is feasible; of course, replacing everything with 1x1 convs is also an option.
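For reference, a minimal sketch (illustrative names and sizes, not code from this repo) of how a linear layer can be rewritten as an equivalent 1x1 convolution, keeping the output channels a multiple of 32:

```python
import torch
import torch.nn as nn

in_features, out_features = 512, 512   # assumed sizes; out_features % 32 == 0

linear = nn.Linear(in_features, out_features)

# Equivalent 1x1 conv: treat features as channels over a 1x1 spatial grid.
conv1x1 = nn.Conv2d(in_features, out_features, kernel_size=1)
with torch.no_grad():
    # Linear weight (out, in) -> conv kernel (out, in, 1, 1)
    conv1x1.weight.copy_(linear.weight.view(out_features, in_features, 1, 1))
    conv1x1.bias.copy_(linear.bias)

x = torch.randn(8, in_features)                               # (batch, features)
y_linear = linear(x)
y_conv = conv1x1(x.view(8, in_features, 1, 1)).flatten(1)     # (batch, out_features)

print(torch.allclose(y_linear, y_conv, atol=1e-5))            # should print True
```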
Indeed, if the Conv is not followed by a BN layer, this problem appears, e.g. in EfficientNet. (This repo belongs to that UCB professor, so he probably can't read Chinese, haha)
Maybe my phone isn't rooted, I think.
Same problem here, I don't have a clue. Does multi-GPU generation matter?
> 1. Yes
> 2. I successfully reproduced the result by uniformly generating 50 images from each class. Results are shown below (IS by torch-fidelity, others by the [guided_diffusion evaluation code](https://github.com/openai/guided-diffusion/tree/main/evaluations)). The...
Hi everyone! I have successfully quantized a diffusion model to 2-bit and manually packed the weights into uint8 format (storing 4 2-bit weights in one uint8 variable) in PyTorch. During inference,...
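For reference, a minimal sketch of one way to pack and unpack the 2-bit weights (illustrative only, not necessarily the exact approach used here):

```python
import torch

def pack_2bit(q):
    """q: uint8 tensor of 2-bit values in [0, 3], length divisible by 4."""
    q = q.reshape(-1, 4)
    # Four 2-bit values per byte: bits 0-1, 2-3, 4-5, 6-7.
    packed = q[:, 0] | (q[:, 1] << 2) | (q[:, 2] << 4) | (q[:, 3] << 6)
    return packed.to(torch.uint8)

def unpack_2bit(packed):
    """Inverse of pack_2bit; returns a flat uint8 tensor of 2-bit values."""
    parts = [(packed >> shift) & 0x3 for shift in (0, 2, 4, 6)]
    return torch.stack(parts, dim=1).reshape(-1).to(torch.uint8)

q = torch.randint(0, 4, (16,), dtype=torch.uint8)   # toy quantized weights
packed = pack_2bit(q)                                # 4x smaller storage
assert torch.equal(unpack_2bit(packed), q)
```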
> > Hi everyone! I have successfully quantized a diffusion model to 2-bit and manually packed the weights into uint8 format (storing 4 2-bit weights in one uint8 variable) in PyTorch....
Any ideas how to do that?

> this looks similar, it says to upgrade cython and regenerate the file: [mcfletch/pyopengl#74](https://github.com/mcfletch/pyopengl/issues/74)
Same OOM problem here when finetuning 7B models on a single A100-80G. The error log is exactly the same as @Williamsunsir's. Theoretically, finetuning a 7B model takes 14G (model...
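For context, a rough back-of-envelope (assuming full fine-tuning with Adam in mixed precision; these are estimates, not measurements from this repo) of why 80 GB can run out even before counting activations:

```python
# Rough memory estimate for full fine-tuning of a 7B model with Adam (assumed setup).
params = 7e9

weights_fp16   = params * 2        # ~14 GB, fp16 model weights
grads_fp16     = params * 2        # ~14 GB, fp16 gradients
adam_states    = params * 4 * 2    # ~56 GB, fp32 first + second moments
master_weights = params * 4        # ~28 GB, fp32 weight copy (mixed precision)

total_gb = (weights_fp16 + grads_fp16 + adam_states + master_weights) / 1e9
print(f"~{total_gb:.0f} GB before activations")   # ~112 GB, already above 80 GB
```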
Hi everyone, I have another question regarding reproducing the XSUM results. In h2o_hf/scripts/summarization/eval.sh, a fixed HH_SIZE and RECENT_SIZE are set, but the x-axis of Figure 4 represents KV Cache Budget (%),...