panacea
Different text prompts produce almost the same video generation results.
Thanks for your great work. However, in our experiments we tried different text prompts, following your instructions (the dataset preparation and inference code), and the video generation results were almost the same. Is there anything wrong?
Hi, sorry for the late reply. At inference, the current code uses the GT image as the conditional frame and generates the subsequent video frames from it. Because those frames are highly correlated with the conditional frame, changing the text prompt has little effect on the attributes the text describes.
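To illustrate the effect described above, here is a toy sketch. It is not the Panacea inference code: `toy_generate`, `text_weight`, and the 3-dimensional "frames" are all made up for illustration. It assumes that each new frame is mostly a copy of the previous frame plus a small text-driven shift, and compares how similar the final frames are for two different prompts, with and without a shared conditional frame.

```python
import numpy as np


def toy_generate(cond_frame, text_emb, num_frames=4, text_weight=0.05):
    """Toy stand-in for a frame-conditioned generator (hypothetical, not Panacea)."""
    frames = [np.asarray(cond_frame, dtype=float)]
    for _ in range(num_frames - 1):
        # Assumption: the next frame stays close to the previous frame,
        # and the text prompt only contributes a small shift.
        nxt = (1.0 - text_weight) * frames[-1] + text_weight * np.asarray(text_emb, dtype=float)
        frames.append(nxt)
    return np.stack(frames)


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical "GT frame" and two orthogonal "prompt embeddings".
gt_frame = np.array([1.0, 0.0, 0.0])
prompt_sunny = np.array([0.0, 1.0, 0.0])
prompt_rainy = np.array([0.0, 0.0, 1.0])

# Case 1: both runs start from the same GT conditional frame.
video_a = toy_generate(gt_frame, prompt_sunny)
video_b = toy_generate(gt_frame, prompt_rainy)
sim_with_gt = cosine(video_a[-1], video_b[-1])

# Case 2: no informative conditional frame (all zeros), so only the text matters.
blank = np.zeros(3)
video_c = toy_generate(blank, prompt_sunny)
video_d = toy_generate(blank, prompt_rainy)
sim_without_gt = cosine(video_c[-1], video_d[-1])

print("similarity of final frames with shared GT frame:", sim_with_gt)
print("similarity of final frames without GT frame:", sim_without_gt)
```

In this toy setup the two prompts give nearly identical final frames when both runs share the GT frame, and clearly different ones when they do not. The real model is far more complex, so treat this only as intuition for why the prompt has little visible effect here.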