pr-agent
Enhance cost-efficiency by supporting OpenAI Flex Processing
Feature request
Introduce support for OpenAI Flex Processing to help reduce API costs for tasks that are not time-sensitive.
Flex Processing is designed for lower-priority or asynchronous workloads. It offers significantly reduced pricing in exchange for slower responses and occasional temporary unavailability. This trade-off is acceptable in many pr-agent use cases, such as PR summarization or suggestion generation.
Importantly, Flex requests can be retried with exponential backoff, or fall back to standard processing when needed, which preserves reliability while still optimizing for cost.
Reference: https://platform.openai.com/docs/guides/flex-processing
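To make the retry-then-fall-back idea concrete, here is a minimal sketch of how it could work. It is not pr-agent's actual code: `call_with_flex_fallback`, `FlexUnavailable`, and the `send` callable are hypothetical names, and the tier strings `"flex"` and `"default"` are assumed to match the `service_tier` values described in the guide linked above. A real integration would map the provider's "resource unavailable" error onto `FlexUnavailable` and would probably also raise the request timeout, since Flex responses are slower.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class FlexUnavailable(Exception):
    """Hypothetical marker: Flex capacity is temporarily unavailable."""


def call_with_flex_fallback(
    send: Callable[[str], T],
    max_flex_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try the Flex tier with exponential backoff, then use the standard tier.

    `send` is a hypothetical callable that performs one request with the
    given service tier and raises FlexUnavailable when Flex has no capacity.
    """
    for attempt in range(max_flex_attempts):
        try:
            return send("flex")
        except FlexUnavailable:
            # Back off before the next Flex attempt; skip the wait after
            # the last one because we fall back immediately.
            if attempt < max_flex_attempts - 1:
                delay = base_delay * (2 ** attempt)
                # Small jitter so parallel CI jobs do not retry in lockstep.
                sleep(delay + random.uniform(0, delay * 0.1))
    # Flex stayed unavailable: pay the standard price rather than fail.
    return send("default")
```

Passing `send` and `sleep` as parameters keeps the sketch independent of any particular client library and makes the backoff easy to test. Errors other than `FlexUnavailable` propagate unchanged, so the fallback only covers the capacity case.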
Motivation
Many of pr-agent’s features run in CI pipelines or as background automation, where instant responses are not critical. Enabling Flex Processing would let teams significantly cut OpenAI usage costs without losing functionality, making pr-agent more sustainable for frequent or large-scale use.