Pengxiang Li

Results: 64 comments by Pengxiang Li

Yes, you can find the corresponding weights on Hugging Face.

This is precisely the problem I am facing at the moment. If we want to do text2video, the presence of `image_latents` is quite peculiar. I've tried changing the `conv in`...

It looks like it's working well. May I ask how many steps this was trained for?

Hi @LTH14, since I'm new to this field, I have a beginner's question. Can I understand unconditional generation as the pipeline in the diagram below without the Rep. Dist.?...

Thank you very much for your response. I have another question, concerning whether current unconditional image generation models are unable to perform an implicit denoising of a Rep. Dist....

Hi [ersanliqiao](https://github.com/ersanliqiao), can you provide some more detailed information?

Thank you very much for your appreciation. We will continue to iterate on this version in the future, aiming for a more accurate understanding of timing in videos. Of course,...

I'm sorry. When I started writing this code, I was more focused on supporting SVD training and didn't give much thought to memory issues. This has caused some inconvenience to...

Thanks for pointing this out, @xuehy, and thanks @potatoQi for echoing the concern. Yes, if different processes (especially on different GPUs) are getting the exact same data at each iteration...
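To illustrate the concern about every process reading the same data, here is a minimal sketch in plain Python of one common remedy: each rank shuffles with a shared seed and then keeps only its own strided slice. The function name `shard_indices` and its parameters are hypothetical and are not taken from the repository under discussion; this is not necessarily how the code there was fixed.

```python
import random


def shard_indices(num_samples, world_size, rank, seed, epoch):
    """Return the sample indices one process should read in a given epoch.

    Hypothetical helper for illustration only.
    """
    # Every rank builds the same shuffled order from the shared seed and epoch.
    order = list(range(num_samples))
    random.Random(seed + epoch).shuffle(order)
    # Each rank then keeps its own strided slice, so no two ranks overlap.
    return order[rank::world_size]


world_size = 4
shards = [
    shard_indices(16, world_size, rank, seed=0, epoch=0)
    for rank in range(world_size)
]
for rank, shard in enumerate(shards):
    print(f"rank {rank}: {shard}")
```

Without the `order[rank::world_size]` step, all ranks would iterate over the identical shuffled list, which is the situation described above. Adding `epoch` to the seed changes the order from one epoch to the next while keeping it consistent across ranks.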