Inference results didn't perform as expected
Hi Haoning,
I was trying inference.py, and the results seem to be underperforming. I was wondering if you could help me check whether I did everything correctly. I downloaded checkpoint_Story from https://huggingface.co/haoningwu/StoryGen/tree/main and installed all requirements from environment.yaml, though I ran into some problems with packages that required C++ (libtorch_cuda_cu.so and xformers). Then I ran with these parameters:
The results looked somewhat like this:
The boy and the dog both look very different from the images I provided. Could you give more guidance on how to use inference.py, on the difference between "multi-image-condition" and "auto-regressive", and on whether I used the inputs correctly? Or is this degradation caused by the C++ packages? I feel that's less likely, though, since they appear to be packages for efficiency.
Thanks a lot!
Sorry for the late response. It seems that you are trying to generate with multiple reference characters, which is a more challenging problem. In the supplementary material of our paper, we provide an example of multi-character generation, but it is slightly less stable than single-character generation because we have not trained specifically for this situation. Also, I recommend using the "auto-regressive" mode rather than "multi-image-condition".