Make omi work on macOS
The Omi app runs on macOS, but when the "try with phone microphone" button is clicked, nothing is transcribed.
What data should it capture?
- [ ] from the microphone
- [ ] system audio
Just like Granola does.
/bounty $1000
-- thinh's comment: clarify the requirements https://github.com/BasedHardware/omi/issues/2010#issuecomment-2777470428
💎 $500 bounty • omi
Steps to solve:
- Start working: Comment `/attempt #2010` with your implementation plan
- Submit work: Create a pull request including `/claim #2010` in the PR body to claim the bounty
- Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts
❗ Important guidelines:
- To claim a bounty, you need to provide a short demo video of your changes in your pull request
- If anything is unclear, ask for clarification before starting as this will help avoid potential rework
- Low-quality AI PRs will not receive review and will be closed
- Do not ask to be assigned unless you've contributed before
Thank you for contributing to BasedHardware/omi!
| Attempt | Started (UTC) | Solution | Actions |
|---|---|---|---|
| 🟢 @cscoderr | Apr 01, 2025, 12:48:13 PM | #2141 | Reward |
| 🔴 @deekshatomer | Mar 16, 2025, 05:56:03 AM | WIP | |
The issue occurs because the package used for recording doesn't support macOS, which is why transcription isn't working. My plan is to either find an alternative package that supports macOS or add macOS support to the current package.
/attempt #2010
I managed to get recording working on macOS and made some configuration changes. Some were straightforward, just copying the existing iOS setup. However, Firebase needs to be configured specifically for macOS, and the bundle ID also requires configuration for the other social integrations to work. Despite this, I was able to get it running using most of the iOS configuration. This is what I have on my end:
https://github.com/user-attachments/assets/3d36559f-a86c-4fe4-a46e-3ccd92ba5ecd
@kodjima33
https://github.com/BasedHardware/omi/pull/2045#issuecomment-2746817026
💡 @cscoderr submitted a pull request that claims the bounty. You can visit your bounty board to reward.
@kodjima33 Is the current macOS UI okay, or would you like me to update it to make it look more like a native Mac app?
Folks, I just want to clarify a bit.
Objective: The Omi AI app should work seamlessly with the audio system on macOS.
Key results:
- Captures system audio on macOS for the meetings use case.
- Works on macOS with all core features: recording, transcribing, chat, apps.
References: the https://www.granola.ai app
Tips: Check all references, and make sure you ask questions to clarify everything about the descriptions (a.k.a. the requirements) before jumping to the implementation.
@cscoderr @deekshatomer
Increasing bounty to $1,000
Can I get this assigned to me?
@beastoin @kodjima33 My previous implementation addressed the objective: "The Omi AI app should work seamlessly with the audio system on macOS." However, I got a bit confused when you mentioned integrating macOS-specific UI. If that's still part of the objective, I'm happy to reopen it and make the necessary updates. Just let me know.
Some thoughts - we first need to capture audio streams from both 1) the microphone and 2) system audio on macOS. No need for Opus encoding here - we can send the PCM stream to the transcription service. For maximum accuracy, we should run each stream (i.e. "me" audio and "them" audio) independently, then merge (this is how Granola works). This means there will be two websocket connections to the Deepgram endpoint, and therefore transcription cost will be doubled. Now, there is an alternative - we could also run transcription locally via a Whisper model (https://github.com/argmaxinc/WhisperKit is excellent). Whisper-v3-large should be "good enough" for most conversations in English and other major languages.
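The merge step above can be sketched out in a few lines. This is a minimal, hypothetical illustration, not Omi's actual data model: it assumes each stream's transcription service returns segments with a `start` timestamp (seconds from session start) and `text`, and that interleaving by start time is sufficient to reconstruct the conversation.

```python
# Sketch: interleave two independently transcribed streams ("me" = mic,
# "them" = system audio) into one chronological conversation.
# Segment shape ({"start": float, "text": str}) is an assumption.
from dataclasses import dataclass
from heapq import merge

@dataclass
class Segment:
    start: float   # seconds from session start
    text: str
    speaker: str   # "me" (microphone) or "them" (system audio)

def merge_streams(me, them):
    """Merge two per-speaker transcript streams, ordered by start time."""
    tagged_me = [Segment(s["start"], s["text"], "me") for s in me]
    tagged_them = [Segment(s["start"], s["text"], "them") for s in them]
    # Both lists are already chronological, so a linear heapq.merge suffices.
    return list(merge(tagged_me, tagged_them, key=lambda s: s.start))

if __name__ == "__main__":
    mic = [{"start": 0.4, "text": "Hi, can you hear me?"},
           {"start": 7.9, "text": "Great, let's start."}]
    system_audio = [{"start": 3.1, "text": "Yes, loud and clear."}]
    for seg in merge_streams(mic, system_audio):
        print(f"[{seg.speaker}] {seg.text}")
```

In a real implementation the two PCM streams would each feed their own Deepgram websocket (or a local WhisperKit session), and the merge would run on the segments those connections emit.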
@beastoin wdyt?
#2443 App on TestFlight, records both system + mic audio
still not done