understanding-ai

https://github.com/showlab/Image2Paragraph

Open flrngel opened this issue 1 year ago • 0 comments

Summary

  • uses BLIP/BLIP-2 to generate a simple caption
  • uses GRiT/Detectron2 to generate dense captions
  • uses Segment Anything to generate region-level semantic information
  • merges all of the above into a single prompt and sends it to GPT
  • runs Canny edge detection on the input image (which is the bullshit part) and generates the new image from the edge map using StableDiffusionControlNetPipeline
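
The merge step can be sketched roughly as below. This is only an illustration of the idea, not the project's actual code: the `Region` type, the `build_prompt` function, the prompt wording, and the sample values are all hypothetical, and the real outputs of the captioning and segmentation models are replaced by hand-written stand-ins.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical box format: (x1, y1, x2, y2) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class Region:
    """One labelled region, e.g. a dense caption or a segment label."""
    label: str
    box: Box


def format_regions(regions: List[Region]) -> str:
    """Render regions as one bullet per line, or a placeholder if empty."""
    if not regions:
        return "- (none)"
    return "\n".join(f"- {r.label}: {list(r.box)}" for r in regions)


def build_prompt(caption: str, dense: List[Region],
                 semantic: List[Region], width: int, height: int) -> str:
    """Merge the three kinds of model output into one text prompt (assumed layout)."""
    return (
        f"Image size: {width}x{height}\n"
        f"Caption: {caption}\n"
        f"Dense captions:\n{format_regions(dense)}\n"
        f"Region semantics:\n{format_regions(semantic)}\n"
        "Write one paragraph describing the image using only the facts above."
    )


# Stand-in values; in the real pipeline these would come from the models.
prompt = build_prompt(
    caption="a dog sitting on a couch",
    dense=[Region("a brown dog", (40, 60, 200, 220))],
    semantic=[Region("couch", (0, 100, 320, 240))],
    width=320,
    height=240,
)
print(prompt)
```

Note that everything GPT sees here is text plus coarse boxes, so the fine layout of the image is not carried by the prompt.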

Conclusion

  • The text prompt this project produces cannot, on its own, generate an image similar to the input; the result only resembles the input because the Canny edge map of the input is also fed to the generator.

flrngel, Apr 18 '23 20:04