Haoning Wu
You can start by generating the first frame of the story in single-frame mode (set stage = ‘no’). Then, use the generated frame as the ref_image to generate the next...
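The frame-by-frame procedure described above can be sketched as a simple loop. This is only an illustration: `generate_frame` is a hypothetical callable standing in for one inference run of the model, not a function from the StoryGen repository, and `max_refs` (how many earlier frames are passed back as references) is an assumption as well.

```python
from typing import Any, Callable, List, Sequence


def generate_story(
    prompts: Sequence[str],
    generate_frame: Callable[[str, List[Any]], Any],
    max_refs: int = 1,
) -> List[Any]:
    """Generate one frame per prompt, feeding earlier outputs back as references.

    `generate_frame(prompt, ref_images)` is a hypothetical wrapper around one
    inference run; an empty `ref_images` list stands for single-frame mode.
    """
    if max_refs < 1:
        raise ValueError("max_refs must be at least 1")

    frames: List[Any] = []
    for prompt in prompts:
        # On the first iteration there is nothing to reference yet,
        # which corresponds to the single-frame case.
        refs = frames[-max_refs:] if frames else []
        frames.append(generate_frame(prompt, refs))
    return frames
```

With the default `max_refs=1`, each call only sees the frame generated immediately before it; raising the value passes more of the earlier frames along.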
1. Since our model is designed based on SDM, all SDM pre-trained parameters need to be used; 2. Please refer to the official implementation repository of yolov7: https://github.com/WongKinYiu/yolov7 3. We...
Thank you for your question. I will address it based on the following points: 1. **Input Transformation**: Please refer to the guidelines in previous issues (https://github.com/haoningwu3639/StoryGen/issues/10#issuecomment-2002906594 and https://github.com/haoningwu3639/StoryGen/issues/14#issuecomment-2021797561). You need...
The dataset we released contains narration and descriptions generated by TextBind, which can be used directly. We also tried MiniGPT-v2. To be honest, it generates better descriptions, but it does...
Please refer to our paper for the distinction between **narrative text** and **descriptive text**. Considering that descriptive text is more suitable as prompts for text-to-image models, we transformed the stories...
Sorry for the confusion. When exporting the environment, I included all the libraries I commonly use. However, for the StoryGen project, Detectron2 is not a necessary dependency and can be...
Sorry for the late reply. For the questions you raised, I have the following suggestions: 1. First of all, introducing more and higher-quality data will improve generation quality;...
Indeed, the performance of Flux.1-dev and Flux.1-schnell is far superior to Stable Diffusion 1.5 and SDXL. Recently, I have also been working on building a code framework for fine-tuning Flux...
It has been a long time since I trained and tested these baselines, so I may have forgotten some of the details. However, I remember that we adopted the official AR-LDM code...
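I suggest checking GPU utilization, GPU memory usage, and CPU usage, and first confirming that torch is actually using the GPU. If the model was loaded onto the CPU, it may appear to hang while running extremely slowly.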
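A quick way to run this check from Python is sketched below. The helper names `parameter_devices` and `cuda_report` are made up for this example and are not part of the StoryGen code; `torch.cuda.is_available()` and `torch.cuda.device_count()` are standard PyTorch calls. PyTorch is imported lazily here so the helpers load even in an environment where it is missing.

```python
import importlib.util
from typing import Any, Dict, Set


def parameter_devices(model: Any) -> Set[str]:
    """Return the device strings of all parameters of a torch.nn.Module-like object."""
    return {str(p.device) for p in model.parameters()}


def cuda_report() -> Dict[str, Any]:
    """Summarize whether PyTorch is installed and can see a CUDA GPU."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False, "cuda_available": False, "gpu_count": 0}

    import torch  # imported lazily so this module loads without PyTorch

    return {
        "torch_installed": True,
        "cuda_available": torch.cuda.is_available(),
        "gpu_count": torch.cuda.device_count(),
    }


if __name__ == "__main__":
    print(cuda_report())
```

If `parameter_devices(model)` on the loaded model contains `cpu`, at least part of the model is still on the CPU, which would explain very slow progress. GPU utilization and memory usage can be watched separately with `nvidia-smi` while the script runs.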