open-webui
fix: Custom OpenAI-TTS URL to fetch actual voices and models
Pull Request Checklist
Note to first-time contributors: Please open a discussion post in Discussions and describe your changes before submitting a pull request.
Before submitting, make sure you've checked the following:
- [x] Target branch: Please verify that the pull request targets the dev branch.
- [x] Description: Provide a concise description of the changes made in this pull request.
- [x] Changelog: Ensure a changelog entry following the format of Keep a Changelog is added at the bottom of the PR description.
- [ ] Documentation: Have you updated the relevant documentation (Open WebUI Docs), or other documentation sources?
- [ ] Dependencies: Are there any new dependencies? Have you updated the dependency versions in the documentation?
- [x] Testing: Have you written and run sufficient tests for validating the changes?
- [x] Code review: Have you performed a self-review of your code, addressing any coding standard issues and ensuring adherence to the project's coding standards?
- [x] Prefix: To clearly categorize this pull request, prefix the pull request title using one of the following:
- BREAKING CHANGE: Significant changes that may affect compatibility
- build: Changes that affect the build system or external dependencies
- ci: Changes to our continuous integration processes or workflows
- chore: Refactor, cleanup, or other non-functional code changes
- docs: Documentation update or addition
- feat: Introduces a new feature or enhancement to the codebase
- fix: Bug fix or error correction
- i18n: Internationalization or localization changes
- perf: Performance improvement
- refactor: Code restructuring for better maintainability, readability, or scalability
- style: Changes that do not affect the meaning of the code (white-space, formatting, missing semi-colons, etc.)
- test: Adding missing tests or correcting existing tests
- WIP: Work in progress, a temporary label for incomplete or ongoing work
Changelog Entry
Description
- This PR updates the OpenAI-TTS endpoints so that if the configured TTS_OPENAI_API_BASE_URL does not start with 'https://api.openai.com', voices and models are fetched from the custom endpoint ({TTS_OPENAI_API_BASE_URL}/audio/voices and {TTS_OPENAI_API_BASE_URL}/audio/models) instead of using the previously hard-coded official OpenAI-TTS defaults.
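A minimal Python sketch of the fallback described above. All names here are illustrative, not the actual Open WebUI code; the hard-coded list is the set of voices OpenAI documents for its TTS models.

```python
import json
import urllib.request

OFFICIAL_OPENAI_BASE = "https://api.openai.com"
# Defaults previously served for every base URL. The official OpenAI API
# exposes no voice-listing endpoint, so these must stay hard-coded for it.
OFFICIAL_VOICES = ["alloy", "echo", "fable", "onyx", "nova", "shimmer"]

def get_tts_voices(base_url: str, api_key: str) -> list:
    """Illustrative sketch: serve the hard-coded voices for the official
    OpenAI URL, otherwise fetch /audio/voices from the custom server."""
    if base_url.startswith(OFFICIAL_OPENAI_BASE):
        return OFFICIAL_VOICES
    req = urllib.request.Request(
        f"{base_url.rstrip('/')}/audio/voices",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```

The same branch would apply to `/audio/models`, with the hard-coded default being the official `tts-1`/`tts-1-hd` model list.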
Changed
- Changed the audio endpoints to fetch the actual voices and models from a custom OpenAI-TTS API URL, instead of always serving the hardcoded official OpenAI-TTS voices and models when a non-official URL is configured
Fixed
- I saw the original behavior as a bug, hence this PR, though whether this counts as a fix or a change is debatable
I'm curious which project the custom endpoint comes from? Is it Kokoro-FastAPI?
I ran into a similar dilemma on a project that I maintain, because the OpenAI spec doesn't have an official way of retrieving the voice list. My solution was to manually check which type of backend it was (Kokoro-FastAPI, Speaches, or official OpenAI) before sending a request, because they all use different specifications. Please see "list_supported_voices" in https://github.com/roryeckel/wyoming_openai/blob/main/compatibility.py
> I'm curious which project the custom endpoint comes from? Is it Kokoro-FastAPI?
It's for a TTS project I made which I have been working on.
I set my TTS_OPENAI_API_BASE_URL to https://<my-openai-style-tts-domain>/v1 and noticed that when using the Open-WebUI API (i.e. https://<open-webui-domain>/api/v1/audio/models), it never requests /v1/audio/models on my project's endpoint (or any destination on my custom URL, for that matter).
Instead, it would always just return the hardcoded official OpenAI voices/models, which in hindsight I wrongly assumed followed this spec: https://<open-webui-domain>/docs#/audio/get_voices_api_v1_audio_voices_get.
> I ran into a similar dilemma on a project that I maintain, because the OpenAI spec doesn't have an official way of retrieving the voice list. My solution was to manually check which type of backend it was (Kokoro-FastAPI, Speaches, or official OpenAI) before sending a request, because they all use different specifications. Please see "list_supported_voices" in https://github.com/roryeckel/wyoming_openai/blob/main/compatibility.py
So you are correct; my original description of it following the OpenAI spec is incorrect. Rather, my changes just request the same endpoints that are listed in the Open-WebUI API docs. Now that you mention that, and having double-checked the OpenAI docs, I'm not sure this is an appropriate change to implement.
Please target our dev branch, Thanks!