omi record with voice

This feature is just too good not to be added

Please add a "record with voice" button

/bounty $100

Mar 05 '25 07:03 kodjima33

💎 $1,000 bounty • omi

Steps to solve:

Start working: Comment /attempt #1953 with your implementation plan
Submit work: Create a pull request including /claim #1953 in the PR body to claim the bounty
Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts

Thank you for contributing to BasedHardware/omi!

Attempt	Started (GMT+0)	Solution
🔴 @Gyan-max	Mar 8, 2025, 1:45:55 PM	WIP
🟢 @Hashino	Mar 16, 2025, 11:54:23 AM	WIP
🟢 @neooriginal	Mar 17, 2025, 11:49:26 AM	#2043
🟢 @AVtheking		#2046

Mar 05 '25 07:03 algora-pbc[bot]

@kodjima33 @algora-pbc
I’d like to work on this issue and wanted to clarify a few things:

Is the bounty for this issue still open?
What exactly does "disable the consumer device" mean?
- Should the device be permanently disabled, or should it be reversible?
- Is there an API endpoint or function already handling device activation that we should modify?
- Does disabling mean preventing further usage, turning off remotely, or something else?

Looking forward to your response. Thanks!

Mar 06 '25 18:03 kotesh-arya

/claim #1953 The user is asking for a brief implementation plan for the GitHub issue #1953: "record with voice" feature. Based on the files I've been shown and the prior conversation, I can see that we've already implemented much of this feature.

Here's what we've implemented so far:

Created a new VoiceRecordingPage for recording audio with the phone's microphone
Added methods to the CaptureProvider to process phone recordings
Added a 'phone' source type to the ConversationSource enum
Updated the getTag() method to display "Phone Recording" for phone-recorded conversations
Added a floating action button to the home page to access the voice recording feature

My implementation plan should summarize these steps and outline what else needs to be done to complete this feature.

# Implementation Plan: Voice Recording Feature

Here's my plan to implement the voice recording feature (Issue #1953):

Create VoiceRecordingPage
- Create a new page with UI for recording audio using the phone's microphone
- Include timer display, recording status indicator, and start/stop controls
- Implement permissions handling for microphone access
Update CaptureProvider
- Add processPhoneRecording method to handle audio files recorded with the phone
- Implement helper methods for creating and finalizing conversations with phone recordings
- Send recorded audio to the server for transcription via WebSocket
Update Conversation Schema
- Add 'phone' to the ConversationSource enum to distinguish phone recordings
- Update getTag() method to display "Phone Recording" for phone-recorded conversations
Add UI Access Point
- Add a floating action button to the home page for quick access to voice recording
- Use red color and microphone icon to make it visually distinct
Integration Testing
- Test the complete flow from recording to transcription to conversation display
- Verify proper error handling for permissions and audio processing
- Confirm recordings are properly categorized as "Phone Recording" in the UI

This implementation makes recording directly with the phone easy and intuitive, while reusing the existing backend infrastructure for transcription and conversation management.

Mar 08 '25 13:03 Gyan-max

@Gyan-max pls never come back again here with your AI shit

Mar 13 '25 20:03 kodjima33

/attempt #1953

Mar 16 '25 11:03 Hashino

Increasing bounty to $200

/bounty $200

Mar 17 '25 06:03 kodjima33

Increasing bounty to $1k if I get a PR today that works like a blast

/bounty $1000

Important:

I need you to add a "record with voice" icon, just like in ChatGPT, that will listen to the voice from the microphone. Once stopped,

Using the existing code that works on "Try with Phone Mic" button - it should receive a transcription and paste it in the chat

Try not to create any new variables or modules or components

Chat page:

Mar 17 '25 08:03 kodjima33

working on it, will try to get it done by today

Mar 17 '25 09:03 neooriginal

/attempt #1953

Mar 17 '25 11:03 neooriginal

@neooriginal amazing but can you make it a little bit more like this pls?

https://github.com/user-attachments/assets/5b8d77db-db4e-453a-a005-2a4fe9ad9721

@AVtheking can you share demo?

Mar 17 '25 20:03 kodjima33

done

Mar 17 '25 22:03 neooriginal

Using the existing code that works on "Try with Phone Mic" button - it should receive a transcription and paste it in the chat

folks, make sure you fully understand the requirements. if you want to propose a better solution that doesn't fully align with the requirement, talk to Nik to clarify it.

make sure we are on the same page.

for example, @neooriginal, did you use a new STT library instead of building on the current "Try with Phone Mic" code?

Mar 18 '25 01:03 beastoin

Hi @beastoin, I tried implementing it by following the requirement. Could you please take a look and check whether the approach is correct? I will upload a demo video in a while, I have to take an exam in a few hours 😅

Mar 18 '25 02:03 AVtheking

I talked to Nik via Telegram already. I can easily implement the backend STT. On-device is superior though:

1. way faster
2. cheaper for you
3. it does not have to be 1:1 accurate like in text messages because the AI can interpret it
4. the Apple one works fine for me since Apple Intelligence
5. works offline
6. I'd say even better than Deepgram on Android/Pixel devices

Mar 18 '25 02:03 neooriginal

@neooriginal tell me more about points 3 and 6. I doubt that on-device STT is better than DG (Deepgram).

no worries, Nik and I are on the same page. if you have talked to him and he's okay with it, it's fine with me.

@AVtheking haha, you should be faster, or Neo could take this piece of cake. looking forward to it.

Mar 18 '25 02:03 beastoin

Nik did not respond to me yet. He didn't seem totally against it though. Try it out yourself, and if it's bad I can change it in 5 minutes. I'm going to hurry up, need the cash.

Mar 18 '25 02:03 neooriginal

@neooriginal man, could you do it yourself? if you want to take this ticket, you must be super strong on your proposed solution.

tips: tell me more about points 3 and 6 based on your research, focus on the WER (word error rate), and use these research findings to discuss with Nik.

it would also be great if you could try both approaches and compare them yourself.
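
For anyone comparing the two approaches on WER, here is a minimal sketch of how the metric can be computed: word-level edit distance (substitutions + deletions + insertions) divided by the number of words in the reference transcript. The function name and the sample sentences are made up for illustration; they are not output from Deepgram or from any on-device engine.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts, for illustration only.
reference = "turn on the kitchen lights"
engine_a = "turn on kitchen light"        # one deletion, one substitution
engine_b = "turn on the kitchen lights"   # exact match

print(word_error_rate(reference, engine_a))
print(word_error_rate(reference, engine_b))
```

A fair comparison would run both engines on the same recordings with human-checked reference transcripts and average the WER over all of them.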

Mar 18 '25 02:03 beastoin

if I take this ticket and deliver good results, will I definitely be selected and get the money? I would present my research, but I do not want others to claim the ticket then. Quite the time pressure right now.

I'm working on just implementing the backend STT. probably easier

Mar 18 '25 03:03 neooriginal

ok everything is working. ready to review

Mar 18 '25 03:03 neooriginal

Hi folks,

Requirement updates: The top priority for this feature is to make it work with unstable internet. So, if I record for 10 minutes and the internet stops working for 5 minutes, and then once I stop recording, I connect to the internet, it should process the audio fully. Ref: https://github.com/BasedHardware/omi/pull/2043#issuecomment-2735452559
Suggested solutions: https://github.com/BasedHardware/omi/pull/2043#issuecomment-2736188805 / https://github.com/BasedHardware/omi/pull/2043#issuecomment-2736281965
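
One way to read the unstable-internet requirement is "never drop a recorded chunk until the server has accepted it". The sketch below illustrates that idea only; it is not the approach taken in the linked comments or in any PR. `BufferedRecorder` and the injected `upload` callable are hypothetical names, and a real app would persist pending chunks to disk rather than keep them in memory.

```python
from collections import deque
from typing import Callable, Deque


class BufferedRecorder:
    """Keeps every recorded chunk until the uploader accepts it."""

    def __init__(self, upload: Callable[[bytes], None]) -> None:
        # `upload` is a hypothetical callable that raises ConnectionError
        # when the network is unavailable.
        self._upload = upload
        self._pending: Deque[bytes] = deque()

    def add_chunk(self, chunk: bytes) -> None:
        """Queue a freshly recorded chunk, then try to send what is queued."""
        self._pending.append(chunk)
        self.flush()

    def flush(self) -> int:
        """Send queued chunks in order; stop at the first network failure."""
        sent = 0
        while self._pending:
            try:
                self._upload(self._pending[0])
            except ConnectionError:
                break  # still offline: keep the chunk and retry later
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending_count(self) -> int:
        return len(self._pending)


# Simulated flaky connection, for illustration only.
network = {"online": False}
received = []


def fake_upload(chunk: bytes) -> None:
    if not network["online"]:
        raise ConnectionError("offline")
    received.append(chunk)


recorder = BufferedRecorder(fake_upload)
recorder.add_chunk(b"chunk-1")   # recorded while offline, stays queued
recorder.add_chunk(b"chunk-2")
network["online"] = True         # connection comes back after recording stops
recorder.flush()                 # both chunks are delivered, in order
```

Because chunks leave the queue only after a successful upload, a connection drop in the middle of a recording delays processing but does not lose audio.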

Mar 19 '25 11:03 beastoin

@beastoin are you finishing this off, or could I try it?

Mar 20 '25 11:03 AVtheking

seems like he does: https://github.com/BasedHardware/omi/pull/2055/commits/9c82db689c69fd2e5cd32c707fba6465c11fadeb

Mar 20 '25 11:03 neooriginal

sorry guys, the requirement changes made it harder for you to finish this ticket in the limited time.

we need to move fast, so I have handled it myself in #2055.

feel free to read it and create a new PR to enhance it if needed.

if you need some coffee to keep your caffeine level always high - feel free to ping me https://discord.omi.me @thinh

thank you for your time.

@Hashino @neooriginal @AVtheking

Mar 21 '25 03:03 beastoin

Product Change Logs

Feature is ready on TestFlight / internal test

@kodjima33 congratulations 🚀

Mar 21 '25 05:03 beastoin