record with voice
This feature is just too good not to be added
Please add "record with voice" button
/bounty $100
💎 $1,000 bounty • omi
Steps to solve:
- Start working: Comment `/attempt #1953` with your implementation plan
- Submit work: Create a pull request including `/claim #1953` in the PR body to claim the bounty
- Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts
Thank you for contributing to BasedHardware/omi!
| Attempt | Started (GMT+0) | Solution |
|---|---|---|
| 🔴 @Gyan-max | Mar 8, 2025, 1:45:55 PM | WIP |
| 🟢 @Hashino | Mar 16, 2025, 11:54:23 AM | WIP |
| 🟢 @neooriginal | Mar 17, 2025, 11:49:26 AM | #2043 |
| 🟢 @AVtheking | | #2046 |
@kodjima33 @algora-pbc
I’d like to work on this issue and wanted to clarify a few things:
- Is the bounty for this issue still open?
- What exactly does "disable the consumer device" mean?
- Should the device be permanently disabled, or should it be reversible?
- Is there an API endpoint or function already handling device activation that we should modify?
- Does disabling mean preventing further usage, turning off remotely, or something else?
Looking forward to your response. Thanks!
/claim #1953
Here's what we've implemented so far:
- Created a new `VoiceRecordingPage` for recording audio with the phone's microphone
- Added methods to the `CaptureProvider` to process phone recordings
- Added a 'phone' source type to the `ConversationSource` enum
- Updated the `getTag()` method to display "Phone Recording" for phone-recorded conversations
- Added a floating action button to the home page to access the voice recording feature

My implementation plan should summarize these steps and outline what else needs to be done to complete this feature.

# Implementation Plan: Voice Recording Feature
Here's my plan to implement the voice recording feature (Issue #1953):
- Create VoiceRecordingPage
  - Create a new page with UI for recording audio using the phone's microphone
  - Include timer display, recording status indicator, and start/stop controls
  - Implement permissions handling for microphone access
- Update CaptureProvider
  - Add `processPhoneRecording` method to handle audio files recorded with the phone
  - Implement helper methods for creating and finalizing conversations with phone recordings
  - Send recorded audio to the server for transcription via WebSocket
- Update Conversation Schema
  - Add 'phone' to the `ConversationSource` enum to distinguish phone recordings
  - Update `getTag()` method to display "Phone Recording" for phone-recorded conversations
- Add UI Access Point
  - Add a floating action button to the home page for quick access to voice recording
  - Use red color and microphone icon to make it visually distinct
- Integration Testing
  - Test the complete flow from recording to transcription to conversation display
  - Verify proper error handling for permissions and audio processing
  - Confirm recordings are properly categorized as "Phone Recording" in the UI
This implementation makes recording directly with the phone easy and intuitive, while reusing the existing backend infrastructure for transcription and conversation management.
@Gyan-max pls never come back again here with your AI shit
Increasing bounty to $200
/bounty $200
Increasing bounty to $1k if I get a PR today that works like a blast
/bounty $1000
Important:
I need you to add a "record with voice" icon, just like in ChatGPT, that will listen to the voice from the microphone. Once stopped,
Using the existing code that works on "Try with Phone Mic" button - it should receive a transcription and paste it in the chat
Try not to create any new variables or modules or components
Chat page:
working on it, will try to get it done by today
/attempt #1953
@neooriginal amazing but can you make it a little bit more like this pls?
https://github.com/user-attachments/assets/5b8d77db-db4e-453a-a005-2a4fe9ad9721
@AVtheking can you share demo?
done
Using the existing code that works on "Try with Phone Mic" button - it should receive a transcription and paste it in the chat
folks, make sure you fully understand the requirements. if you want to propose a better solution that doesn't fully align with the requirement, let's talk to Nik to clarify it.
let's make sure we are on the same page.
for example, @neooriginal, did you use a new STT library instead of building on the current "Try with Phone Mic"?
Hi @beastoin, I tried implementing it by following the requirement. Could you please take a look and check whether the approach is correct? I will upload a demo video in a while, I have an exam in a few hours 😅
I already talked to Nik via Telegram. I can easily implement the backend STT. On-device is superior though:
1. way faster
2. cheaper for you
3. it does not have to be 1:1 accurate like in text messages because the AI can interpret it
4. the Apple one has worked fine for me since Apple Intelligence
5. works offline
6. I'd say even better than Deepgram on Android/Pixel devices
@neooriginal tell me more about points 3 and 6, I doubt that the on-device STT is better than DG (Deepgram)
no worries, Nik and I are on the same page, if you have talked to him and he's okay with it - it's fine with me.
@AVtheking haha you should be faster, or Neo could take this piece of cake. looking forward.
Nik did not respond to me yet. He didn't seem totally against it though. Try it out yourself and if it's bad I can change it within 5 mins. I'm going to hurry up, need the cash
@neooriginal man, could you do it yourself? if you want to take this ticket, you must be super strong on your proposed solution.
tips: tell me more about points 3 and 6 based on your research, focus on the WER (word error rate), and use these research findings to discuss with Nik.
it would also be great if you could try the 2 approaches and compare them yourself.
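For anyone comparing the two approaches, here is a minimal sketch of how WER could be computed, assuming you have a hand-written reference transcript and the text each STT engine returned for the same audio. The function name `word_error_rate` and the sample strings are made up for illustration; they are not part of the omi codebase and not real measurements.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of words in the reference."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Illustrative strings only, not output from any real engine.
    reference = "remind me to call the dentist tomorrow"
    on_device = "remind me to call dentist tomorrow"
    backend = "remind me to call the dentist tomorrow"
    print("on-device WER:", word_error_rate(reference, on_device))
    print("backend WER:", word_error_rate(reference, backend))
```

Running both engines over the same set of recordings and averaging the WER per engine would give numbers to bring to Nik. Punctuation is not normalized here, so strip it first if the engines differ in how they punctuate.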
if I take this ticket and deliver good results, will I definitely be selected and get the money? I would present my research but I do not want others to claim the ticket then. Quite the time pressure right now.
I'm working on just implementing backend STT. probably easier
ok everything is working. ready to review
Hi folks,
- Requirement updates: The top priority for this feature is to make it work with unstable internet. So, if I record for 10 minutes and the internet stops working for 5 minutes, and then once I stop recording, I connect to the internet, it should process the audio fully. Ref: https://github.com/BasedHardware/omi/pull/2043#issuecomment-2735452559
- Suggested solutions: https://github.com/BasedHardware/omi/pull/2043#issuecomment-2736188805 / https://github.com/BasedHardware/omi/pull/2043#issuecomment-2736281965
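
To make the unstable-internet requirement concrete, here is a rough sketch of one way to handle it: keep every recorded chunk in a local queue and only drop a chunk once the server has accepted it. This is an assumption about a possible design, not what the linked comments or any PR actually do; `BufferedUploader` and the `send` callback are hypothetical names.

```python
from collections import deque
from typing import Callable, Deque


class BufferedUploader:
    """Queues audio chunks locally and uploads them in order when possible."""

    def __init__(self, send: Callable[[bytes], None]) -> None:
        # `send` is a hypothetical transport callback. It is assumed to raise
        # ConnectionError when the network is unavailable.
        self._send = send
        self._pending: Deque[bytes] = deque()

    def add_chunk(self, chunk: bytes) -> None:
        """Store a newly recorded chunk, then try to upload what is queued."""
        self._pending.append(chunk)
        self.flush()

    def flush(self) -> int:
        """Upload queued chunks in order; stop at the first failure.

        Returns the number of chunks uploaded during this call.
        """
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break  # still offline, keep the chunk for the next attempt
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending(self) -> int:
        """Number of chunks still waiting to be uploaded."""
        return len(self._pending)
```

With this shape, a 10-minute recording with a 5-minute outage just leaves chunks in the queue during the outage; calling `flush()` once the connection is back (for example after the user stops recording) sends the rest in order. A real app would persist the queue to disk so the audio survives an app restart.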
@beastoin are you finishing this off, or could I try it?
seems like he is: https://github.com/BasedHardware/omi/pull/2055/commits/9c82db689c69fd2e5cd32c707fba6465c11fadeb
sorry guys, the requirement changes make it harder for you to finish this ticket in the limited time.
we need to move fast so i have handled it by myself #2055
feel free to read it, and create new PR to enhance it if needed.
if you need some coffee to keep your caffeine level always high - feel free to ping me https://discord.omi.me @thinh
thank you for your time.
@Hashino @neooriginal @AVtheking
Product Change Logs
- Feature is ready on TestFlight / internal test
@kodjima33 congratulations 🚀