pipecat Implement "OpenAI Image Generation" service as an integration in `openai`

Service Name

gpt-image-1

Service Website

https://platform.openai.com/docs/api-reference/images

Service Description

The OpenAI API lets you generate and edit images from text prompts, using the GPT Image or DALL·E models.

The Image API provides three endpoints, each with distinct capabilities:

Generations: Generate images from scratch based on a text prompt
Edits: Modify existing images using a new prompt, either partially or entirely
Variations: Generate variations of an existing image (available with DALL·E 2 only)

You can also customize the output by specifying the quality, size, format, compression, and whether you would like a transparent background.
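
As a rough illustration of those output options, here is a minimal sketch that assembles a JSON body for the Create image endpoint. The field names (`quality`, `size`, `output_format`, `output_compression`, `background`) follow the public API reference as I understand it. The helper `build_generation_payload` is hypothetical and is not part of pipecat or the OpenAI SDK.

```python
import json
from typing import Any, Dict, Optional


def build_generation_payload(
    prompt: str,
    model: str = "gpt-image-1",
    size: str = "1024x1024",
    quality: str = "auto",
    output_format: str = "png",
    output_compression: Optional[int] = None,
    transparent_background: bool = False,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble the JSON body for an image generation request."""
    payload: Dict[str, Any] = {
        "model": model,
        "prompt": prompt,
        "size": size,
        "quality": quality,
        "output_format": output_format,
    }
    if output_compression is not None:
        # Compression only applies to the lossy output formats.
        if output_format not in ("jpeg", "webp"):
            raise ValueError("output_compression requires jpeg or webp output")
        if not 0 <= output_compression <= 100:
            raise ValueError("output_compression must be between 0 and 100")
        payload["output_compression"] = output_compression
    if transparent_background:
        # Transparency needs an output format with an alpha channel.
        if output_format not in ("png", "webp"):
            raise ValueError("a transparent background requires png or webp output")
        payload["background"] = "transparent"
    return payload


if __name__ == "__main__":
    body = build_generation_payload("a red bicycle", transparent_background=True)
    print(json.dumps(body, indent=2))
```

Validating the combinations locally is a design choice for the sketch only; a real integration could just as well forward the values and let the API reject invalid ones.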

Reference:

https://openai.com/index/image-generation-api/
https://platform.openai.com/docs/guides/image-generation?image-generation-model=gpt-image-1

API Information

API Docs: https://platform.openai.com/docs/api-reference/images

Auth Method: HTTP Bearer Authentication (https://platform.openai.com/docs/api-reference/authentication)

Key endpoints:

Create image: https://platform.openai.com/docs/api-reference/images/create
Create image edit: https://platform.openai.com/docs/api-reference/images/createEdit
Create image variation: https://platform.openai.com/docs/api-reference/images/createVariation
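
For illustration, a minimal sketch of how a Bearer-authenticated request to the Create image endpoint could be prepared with only the standard library. The base URL and paths are the ones I believe the API reference lists. The helper name `build_generation_request` and the `sk-example` key are hypothetical, and nothing is actually sent.

```python
import json
import urllib.request

API_BASE = "https://api.openai.com/v1"

# Paths for the three Image API operations listed above.
IMAGE_ENDPOINTS = {
    "generations": "/images/generations",
    "edits": "/images/edits",
    "variations": "/images/variations",
}


def build_generation_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Hypothetical helper: prepare (but do not send) a Create image request."""
    return urllib.request.Request(
        url=API_BASE + IMAGE_ENDPOINTS["generations"],
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # HTTP Bearer authentication: the API key goes in the Authorization header.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_generation_request(
        "sk-example", {"model": "gpt-image-1", "prompt": "a lighthouse at dusk"}
    )
    print(req.get_method(), req.full_url)
```

The edit and variation endpoints take image uploads as multipart form data rather than JSON, so they would need a different body encoding than this sketch shows.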

Would you be willing to help implement this service?

[x] Yes, I'd like to contribute
[ ] No, I'm just suggesting

Apr 24 '25 21:04 oxcabe

@oxcabe this would be a great addition! If you get time to work on it, we'd love to include it.

Apr 25 '25 01:04 markbackman

Sure! My plan is to send an implementation PR before Tue 29. Otherwise, I'd be okay with someone else taking this on instead.

Apr 25 '25 11:04 oxcabe

Just started working on this @markbackman

I noticed there's already an image service in the openai package, at src/pipecat/services/openai/image.py, which implements dall-e-3 API access. Unexpectedly, I found nothing about this service in the docs.

Given the current state of things, what I intend to do is:

Implement gpt-image-1 on top of what's in there already. Refactoring, if required, will always be non-breaking.
Document the entire service in the same way it's been done for fal and Google Imagen.

AFAIK there are no integration tests for services, or am I wrong? Should I include any tests in the upcoming PR?

Apr 26 '25 21:04 oxcabe

You're right, we were missing docs. I just fixed that: https://github.com/pipecat-ai/docs/pull/213.

That plan sounds good!

For now, don't worry about integration tests. We need to write examples of how to do this for services first.

Thanks again for adding this!

Apr 28 '25 13:04 markbackman

Quick update: I'm delayed on my plans as we're having a cross-country power outage here in Spain 😅

I'm currently getting my OpenAI organisation account verified so I can access gpt-image-1. I'm also making changes to improve maintainability. Although they're still non-breaking, I'd like to deprecate the current service arguments in favour of the approach the rest of the image services use.
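
To make the non-breaking deprecation idea concrete, here is a rough sketch of the general pattern. It assumes the target approach is a single params object, which the thread does not confirm. `ImageParams`, `ImageGenService`, and the argument names are hypothetical stand-ins, not the actual pipecat classes.

```python
import warnings
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class ImageParams:
    """Hypothetical params container (assumption, not the pipecat API)."""

    model: str = "dall-e-3"
    image_size: str = "1024x1024"


class ImageGenService:
    """Hypothetical stand-in for the existing image service."""

    def __init__(
        self,
        *,
        params: Optional[ImageParams] = None,
        model: Optional[str] = None,  # legacy keyword, kept for compatibility
        image_size: Optional[str] = None,  # legacy keyword, kept for compatibility
    ):
        # Copy so the caller's params object is never mutated.
        self.params = replace(params) if params is not None else ImageParams()
        legacy = {"model": model, "image_size": image_size}
        for name, value in legacy.items():
            if value is None:
                continue
            # Old call sites keep working but are told how to migrate.
            warnings.warn(
                f"'{name}' is deprecated; pass params=ImageParams({name}=...) instead",
                DeprecationWarning,
                stacklevel=2,
            )
            setattr(self.params, name, value)


if __name__ == "__main__":
    old_style = ImageGenService(model="gpt-image-1")
    new_style = ImageGenService(params=ImageParams(model="gpt-image-1"))
    print(old_style.params == new_style.params)
```

In this sketch a legacy keyword overrides the matching field in `params` when both are given; the opposite precedence would be just as defensible.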

Apr 29 '25 13:04 oxcabe

Hope you get power back ASAP 🙏

Apr 30 '25 03:04 markbackman