rig icon indicating copy to clipboard operation
rig copied to clipboard

feat: Image input (gpt-4 vision preview) support

Open yiwufen opened this issue 3 months ago • 2 comments

Inquiry About Image Input Support (e.g., Using GPT-4 Vision Preview API)

Description

Hello,

I am currently using your framework for project development and would like to know if it supports image input functionalities. Specifically, I am interested in whether it's possible to integrate and utilize OpenAI's GPT-4 Vision Preview API for processing and analyzing image data.

Questions

  1. Image Input Support
    Does the framework have built-in modules or functionalities that support image inputs?

  2. Usage Examples or Documentation
    If supported, could you provide relevant usage examples or links to documentation to guide me on how to integrate and use this feature?

  3. Future Feature Plans
    If image input is not currently supported, are there any plans to include this functionality in future releases?

  4. Integration with Third-Party Libraries
    If the framework does not support image inputs at the moment, do you have any recommended methods or third-party libraries that can be easily integrated to add image input capabilities?

Additional Information

To better meet my project requirements, I aim to leverage both image processing capabilities and existing text processing features. If there are any example codes or best practices available, I would greatly appreciate it if you could share them.

Thank you for your assistance!

yiwufen avatar Nov 25 '24 09:11 yiwufen