ApolloRay
I have solved this problem, and StreamDiffusion can now support SDXL models. To make it work, I had to change the VAE dtype to fp32; otherwise the result overflows. OHOHOHOHOH!!!!!
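For readers wondering why fp16 would overflow here: half precision cannot hold finite values above 65504, so a large intermediate activation cannot be represented. The snippet below is only a minimal, standard-library illustration of that limit; `fits_in_fp16` is a hypothetical helper, not part of StreamDiffusion or diffusers. With a diffusers pipeline, the change described above would presumably be something like `pipe.vae.to(dtype=torch.float32)`, but that is an assumption, since the actual patch has not been posted in full.

```python
import struct

# Largest finite value representable in IEEE 754 half precision (fp16).
FP16_MAX = 65504.0


def fits_in_fp16(value: float) -> bool:
    """Return True if `value` can be packed as a finite fp16 number."""
    try:
        # "e" is the struct format code for a half-precision float.
        struct.pack("<e", value)
        return True
    except OverflowError:
        # Raised when the value is too large for fp16.
        return False


print(fits_in_fp16(FP16_MAX))  # value at the fp16 limit
print(fits_in_fp16(70000.0))   # value beyond the fp16 limit
```

fp32 has a far larger range (roughly 3.4e38), which is why running only the VAE in fp32 can avoid the overflow while the rest of the pipeline stays in fp16.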
> @ApolloRay Nice, can you share your method with us? And could you show us the txt2img speed? Thanks.

I will refine my code and release it soon.
```python
import time
from tkinter import X
from typing import List, Optional, Union, Any, Dict, Tuple, Literal

import numpy as np
import PIL.Image
import torch
from diffusers import LCMScheduler, StableDiffusionPipeline, StableDiffusionXLPipeline
...
```
Lines marked `#ADD` are the code I added.
> I have this problem when I launch, please help~

StreamDiffusion doesn't support SDXL models.
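That's enough. Do you have any statistics on inference time? Could an A10 card handle the inference?
It still seems fairly slow? It looks like it is implemented as wav2lip plus a source video.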
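> > It still seems fairly slow? It looks like it is implemented as wav2lip plus a source video.
>
> It really isn't mouth swapping. Take a look at http://smartcan-zzx-zhaolei-a3-1.smartcan-service.1997894777848128.cn-beijing.fc.devsapp.net/ and you will see that the hand movements are in sync with the voice.
>
> It is slow because it runs on Alibaba Cloud's GPU service; locally it can run in real time.

How is this implemented? Is there a paper?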
> We need more **TRULY** excellent work like this~ **M**ake **A**IGC **G**reat **A**gain!

A big shot from Baidu.
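> Looking forward to the author's optimizations.

Here are a few optimization directions for you:

1) You have already been told how the UNet is trained, so you could consider training an LCM-LoRA. Going from 25 denoising steps down to 4~6 steps can cut more than 70% of the time.
2) Following current distillation methods can reduce the time by roughly 30%.
3) Consider quantizing the model, which can cut 30%~40% of the time.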
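As a rough sanity check on the first suggestion, the arithmetic can be sketched as follows. This assumes denoising time scales linearly with the number of steps and ignores fixed costs such as the VAE and text encoder; `step_time_saving` is a hypothetical helper written for illustration only.

```python
def step_time_saving(steps_before: int, steps_after: int) -> float:
    """Fraction of denoising time saved, assuming cost is linear in step count."""
    if steps_before <= 0 or not 0 <= steps_after <= steps_before:
        raise ValueError("expected 0 <= steps_after <= steps_before and steps_before > 0")
    return 1.0 - steps_after / steps_before


for steps in (6, 4):
    saved = step_time_saving(25, steps)
    print(f"25 -> {steps} steps: {saved:.0%} saved")
# prints:
# 25 -> 6 steps: 76% saved
# 25 -> 4 steps: 84% saved
```

Under that linear assumption, both ends of the 4~6 step range land above the 70% figure quoted above; the end-to-end saving would be smaller once the fixed costs are included.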