understanding-ai
https://github.com/showlab/Image2Paragraph
Summary
- uses BLIP/BLIP-2 to generate a simple caption of the whole image
- uses GRiT (built on Detectron2) to generate dense captions of detected regions
- uses Segment Anything to generate region-level semantic information
- unifies all of the above into a single prompt for GPT
- applies Canny edge detection to the input image (the questionable part) and generates the new image with StableDiffusionControlNetPipeline
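The "unify and prompt" step can be sketched as below. The function name, prompt wording, and data shapes are assumptions for illustration only; the actual Image2Paragraph template may differ.

```python
def build_gpt_prompt(simple_caption, dense_captions, region_semantics):
    """Merge the three caption sources into one instruction for GPT.

    All names and the prompt wording here are hypothetical sketches,
    not the project's exact template.
    """
    dense = "; ".join(dense_captions)
    regions = "; ".join(
        f"{label} at {box}" for label, box in region_semantics
    )
    return (
        "Write one coherent paragraph describing an image.\n"
        f"Overall caption (BLIP): {simple_caption}\n"
        f"Dense captions (GRiT): {dense}\n"
        f"Region semantics (SAM): {regions}\n"
        "Resolve conflicts in favor of the dense captions."
    )

# Example usage with made-up captions and bounding boxes:
prompt = build_gpt_prompt(
    "a dog on a beach",
    ["a brown dog running", "waves on the shore"],
    [("dog", (34, 50, 210, 190)), ("sea", (0, 120, 512, 300))],
)
```

Note that nothing in this merged prompt carries the image's spatial layout beyond coarse boxes, which is consistent with the conclusion below.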
Conclusion
- The output prompt from this project cannot, on its own, generate an image similar to the input; the similarity comes from conditioning ControlNet on the Canny edge map of the input image.