
Question about the inference process

Open JiaojiaoYe1994 opened this issue 1 year ago • 0 comments

Thank you for the cool work! After reading the paper and reproducing the results, I have a question regarding the inference part.

Inference with the quantized model should only require the quantized model itself, so why do we need to load the FP32 model first? Taking txt2img.py as an example: why do we load the original FP32 checkpoint, i.e. sd-v1-4.ckpt, and then load the quantized checkpoint, i.e. sd_w8a8_ckpt.pth, to run inference?

The detailed implementation is in https://github.com/Xiuyu-Li/q-diffusion/blob/94fd0ecabc6e7545208c4809d84df091999ce4ad/scripts/txt2img.py#L311, which loads the full-precision model.
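
For context, the two-step loading pattern being asked about usually looks something like the sketch below. This is only an illustration of the flow described in the question, not the repository's actual code; `build_model`, `wrap_quant`, and the checkpoint key layout are assumptions.

```python
# Illustrative sketch of loading an FP32 checkpoint first and then a
# quantized checkpoint on top of it. All helper names are hypothetical.
import torch

def load_quantized_pipeline(fp32_ckpt_path, quant_ckpt_path, build_model, wrap_quant):
    # 1) Instantiate the network and load the full-precision weights;
    #    this defines the module structure before any quantization wrappers exist.
    model = build_model()
    fp32_state = torch.load(fp32_ckpt_path, map_location="cpu")
    model.load_state_dict(fp32_state.get("state_dict", fp32_state), strict=False)

    # 2) Wrap layers with quantized modules so the parameter names match
    #    the keys stored in the quantized checkpoint (assumed helper).
    qmodel = wrap_quant(model)

    # 3) Load the quantized weights / quantization parameters, overwriting
    #    the FP32 values loaded in step 1.
    quant_state = torch.load(quant_ckpt_path, map_location="cpu")
    qmodel.load_state_dict(quant_state, strict=False)
    return qmodel.eval()
```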

JiaojiaoYe1994 · Aug 01 '23