Implement "OpenAI Image Generation" service as an integration in `openai`
Service Name
gpt-image-1
Service Website
https://platform.openai.com/docs/api-reference/images
Service Description
The OpenAI API lets you generate and edit images from text prompts, using the GPT Image or DALL·E models.
The Image API provides three endpoints, each with distinct capabilities:
- Generations: Generate images from scratch based on a text prompt
- Edits: Modify existing images using a new prompt, either partially or entirely
- Variations: Generate variations of an existing image (available with DALL·E 2 only)
You can also customize the output by specifying the quality, size, format, compression, and whether you would like a transparent background.
Reference:
- https://openai.com/index/image-generation-api/
- https://platform.openai.com/docs/guides/image-generation?image-generation-model=gpt-image-1
API Information
API Docs: https://platform.openai.com/docs/api-reference/images
Auth Method: HTTP Bearer Authentication (https://platform.openai.com/docs/api-reference/authentication)
Key endpoints:
- Create image: https://platform.openai.com/docs/api-reference/images/create
- Create image edit: https://platform.openai.com/docs/api-reference/images/createEdit
- Create image variation: https://platform.openai.com/docs/api-reference/images/createVariation
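For illustration, a "Create image" call with Bearer authentication could be assembled roughly as below. This is a minimal sketch and not part of the proposed integration: the `https://api.openai.com/v1/images/generations` URL and the optional field names (`background`, `output_format`) come from my reading of the API reference rather than from this issue, and `build_generation_request` is a hypothetical helper. The request is only built here, not sent.

```python
import json
import urllib.request

# Assumed REST endpoint behind the "Create image" docs page; check the API reference.
API_URL = "https://api.openai.com/v1/images/generations"


def build_generation_request(api_key: str, prompt: str, *, model: str = "gpt-image-1",
                             size: str = "1024x1024", **options) -> urllib.request.Request:
    """Build (but do not send) a POST request for the image generation endpoint.

    Extra keyword arguments (for example background="transparent") are copied
    into the JSON body unchanged; their names are assumptions to verify.
    """
    payload = {"model": model, "prompt": prompt, "size": size, **options}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # HTTP Bearer authentication, as listed under "Auth Method" above.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_generation_request("sk-placeholder", "a red fox", background="transparent")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the resulting object to `urllib.request.urlopen` with a real key would perform the call; the edit and variation endpoints take image uploads and would need multipart form data instead of a JSON body.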
Would you be willing to help implement this service?
- [x] Yes, I'd like to contribute
- [ ] No, I'm just suggesting
@oxcabe this would be a great addition! If you get time to work on it, we'd love to include it.
Sure! My plan is to send an implementation PR before Tue 29. Otherwise, I'd be okay with someone else taking this on instead.
Just started working on this @markbackman
I noticed there's already an image service in the openai package, at `src/pipecat/services/openai/image.py`, which implements dall-e-3 API access. Unexpectedly, I found nothing about this service in the docs.
Given the current state of things, what I intend to do is:
- Implement `gpt-image-1` on top of what's in there already. Refactoring, if required, will always be non-breaking.
- Document the entire service in the same way it's been done for fal and Google Imagen.
AFAIK there are no integration tests for services, or am I wrong? Should I include any tests in the upcoming PR?
You're right, we were missing docs. I just fixed that: https://github.com/pipecat-ai/docs/pull/213.
That plan sounds good!
For now, don't worry about integration tests. We need to write examples of how to do this for services first.
Thanks again for adding this!
Quick update: I'm delayed on my plans as we're having a cross-country power outage here in Spain 😅
I'm currently getting my OpenAI organisation account verified so I can access gpt-image-1.
I'm also making changes to improve maintainability. Although they're still non-breaking, I'm interested in deprecating the current service arguments in favour of the approach the rest of the image services use.
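A minimal sketch of what such a non-breaking deprecation could look like, assuming the old constructor takes loose keyword arguments and the new style takes a single settings object. Every name here (`ImageParams`, `ImageService`, `image_size`) is hypothetical and does not reflect the real classes in the repository.

```python
import warnings
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class ImageParams:
    """Hypothetical settings object standing in for the shared approach."""
    model: str = "gpt-image-1"
    size: str = "1024x1024"


class ImageService:
    """Hypothetical service used only to illustrate the deprecation path."""

    def __init__(self, *, params: Optional[ImageParams] = None,
                 model: Optional[str] = None, image_size: Optional[str] = None):
        # Copy the caller's params so later overrides do not mutate their object.
        self.params = replace(params) if params is not None else ImageParams()
        # Old-style arguments still work: each one warns and is folded into params.
        for old_name, value, field in (("model", model, "model"),
                                       ("image_size", image_size, "size")):
            if value is not None:
                warnings.warn(
                    f"'{old_name}' is deprecated; pass params=ImageParams({field}=...) instead.",
                    DeprecationWarning,
                    stacklevel=2,
                )
                setattr(self.params, field, value)
```

Existing callers such as `ImageService(image_size="512x512")` keep working and only see a `DeprecationWarning`, while new callers pass `params=ImageParams(...)` and see no warning.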
Hope you get power back ASAP 🙏