feat: Implement OpenAI style local API server for audio transcription

Open • Yorick-Ryu opened this issue 2 weeks ago • 5 comments

Before Submitting This PR

Please confirm you have done the following:

  • [X] I have searched existing issues and pull requests (including closed ones) to ensure this isn't a duplicate
  • [X] I have read CONTRIBUTING.md

If this is a feature or change that was previously closed/rejected:

  • [ ] I have explained in the description below why this should be reconsidered
  • [ ] I have gathered community feedback (link to discussion below)

Human Written Description

I implemented a local STT API that follows the OpenAI Whisper format. Currently, the Whisper model is only accessible within Handy; however, many users want to leverage this functionality for external tasks like subtitle transcription without loading multiple model instances. This change exposes the speech-to-text capability as a standardized service, allowing users to do more with limited system memory.
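As an illustration of what a client call might look like, here is a minimal Python sketch that builds (but does not send) a multipart request for the `/v1/audio/transcriptions` endpoint. The base URL and port, the `model` value, and the helper name `build_transcription_request` are assumptions made for the example; the `file` and `model` form fields follow OpenAI's transcription API, and whether this server requires or ignores `model` is not stated here.

```python
import uuid
import urllib.request

# Hypothetical address: the port the local server listens on is an assumption.
BASE_URL = "http://127.0.0.1:8080"


def build_transcription_request(audio_bytes, filename="audio.mp3",
                                model="whisper-1", base_url=BASE_URL):
    """Build, without sending, a multipart POST to /v1/audio/transcriptions."""
    boundary = uuid.uuid4().hex
    parts = [
        # Text field "model", as in OpenAI's transcription API.
        (f"--{boundary}\r\n"
         'Content-Disposition: form-data; name="model"\r\n\r\n'
         f"{model}\r\n").encode(),
        # File field "file" carrying the raw MP3 bytes.
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
         "Content-Type: audio/mpeg\r\n\r\n").encode(),
        audio_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ]
    return urllib.request.Request(
        base_url + "/v1/audio/transcriptions",
        data=b"".join(parts),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


# Placeholder bytes stand in for a real MP3 file.
req = build_transcription_request(b"abc")
```

With the server running, the request could be sent with `urllib.request.urlopen(req)`; the shape of the response body depends on the server and is not assumed here.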

Related Issues/Discussions

Fixes # None

Discussion: https://github.com/cjpais/Handy/discussions/241

Community Feedback

https://github.com/cjpais/Handy/discussions/241

Testing

Environment:

  • Tested on: macOS 26.2 (Apple Silicon M1 Pro)
  • Status: Functional on macOS. Need help testing on Windows and Linux platforms to ensure consistent behavior.

Test Cases:

  • Features: Tested by calling the API with curl and by running the demo "convert MP3 to SRT".
  • On-demand Loading: Verified via curl that calling the /v1/audio/transcriptions endpoint correctly triggers the model loading process in the background.
  • Waiting Mechanism: Confirmed the API response waits until the model is fully loaded before processing the transcription, preventing "Model not loaded" errors.
  • Verified Limitations: Tested various audio formats and confirmed only MP3 currently works reliably; documented this behavior and added a "welcome PRs" note in LOCAL_API.md to guide future contributors.
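
The on-demand loading and waiting behavior described above can be sketched roughly as follows. This is a simplified Python stand-in, not the code in this PR: the class `LazyModel`, the `slow_loader` function, and the 30-second timeout are all hypothetical names and values chosen for illustration. The first request starts loading in a background thread, and every request blocks on an event until the model is ready instead of failing with a "Model not loaded" error.

```python
import threading
import time


class LazyModel:
    """Loads a stand-in model on first use; callers block until it is ready."""

    def __init__(self, loader):
        self._loader = loader
        self._ready = threading.Event()
        self._lock = threading.Lock()
        self._started = False
        self._model = None

    def _load(self):
        self._model = self._loader()
        self._ready.set()  # wake every request waiting on the model

    def ensure_loading(self):
        # Start the background load at most once, even under concurrent requests.
        with self._lock:
            if not self._started:
                self._started = True
                threading.Thread(target=self._load, daemon=True).start()

    def transcribe(self, audio, timeout=30.0):
        self.ensure_loading()
        # Wait for the load to finish rather than returning an error early.
        if not self._ready.wait(timeout):
            raise TimeoutError("model did not finish loading in time")
        return self._model(audio)


def slow_loader():
    time.sleep(0.1)  # pretend that loading takes a while
    return lambda audio: f"transcribed {len(audio)} bytes"


model = LazyModel(slow_loader)
result = model.transcribe(b"\x00" * 4)
```

The timeout is a design choice in this sketch: without one, a request would hang indefinitely if loading failed.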

Screenshots/Videos (if applicable)

[Screenshots: image2, image1]

Yorick-Ryu • Jan 02 '26 14:01