Make a demo of device speaking ($2000)
Is your feature request related to a problem? Please describe. People want to talk to the device. Our v2 device is equipped with a speaker but doesn't speak yet
Describe the solution you'd like Solution should be exactly like our in-app chat but with audio from device. For example, I click on the button (on device) and ask "hey what's the capital of united states" and device should respond with audio "Washington DC". This response and prompt should be visible on the chat inside of the app
This functionality should be disablable via settings.
Additional context There was a PR submitted a while ago to make the app speak, take a look at that
This is a paid task. Reward is $2,000 in cash. Simply link your PR to this issue to receive the money.
sweet!
I can get this working but would be using Deepgram only.
Go ahead with a PR pls @DamienDeepgram , you don't need permission to do great things.
@hoai265 you too - great firmware dev man
Sorry guys had to update the bounty
so sad, no one took this :dead: too sweet $1K
Hi @kodjima33 , I’d love to take this task. Please assign it to me, and I’ll get started. Thanks!
@Sanchay-T yes, pls keep us updated! every 24h would be great ~ just to keep your motivation up!
Hi @beastoin 👋
I've been diving into the codebase to understand how we can implement voice responses for the device. Really interesting architecture you've built here! While I'm familiar with backend systems, I'm getting up to speed with some of the Flutter and BLE specifics.
Looking at the current audio pipeline, I can see we're handling real-time streaming through WebSockets pretty elegantly. The socket service in app/lib/services/sockets.dart seems to be the core of this:
Future<TranscriptSegmentSocketService?> socket({
  required BleAudioCodec codec,
  required int sampleRate,
  required String language,
  bool force = false,
}) async {
  // ... (body omitted in this excerpt)
}
For implementing the voice response feature, I think we can build on this foundation. I see we're already integrated with OpenAI's APIs in backend/http/openai.dart, which could be extended for text-to-speech capabilities.
I have a few questions about the device interaction part though. Looking at app/lib/services/devices/models.dart, I see how we're handling BLE characteristics:
BluetoothCharacteristic? getCharacteristicByUuid(BluetoothService service, String uuid) {
  return service.characteristics.firstWhereOrNull(
    (characteristic) => characteristic.uuid.str128.toLowerCase() == uuid.toLowerCase(),
  );
}
Before I proceed with the implementation, I wanted to check:
- For handling button presses - what would be the best way to detect when the user wants to trigger a voice command? Should we use an existing characteristic or define a new one?
- Regarding audio playback through the device's speaker - are there any specific format requirements or limitations I should be aware of?
- For the chat interface, I see we're using the Memory system to handle conversations. Would adding voice responses require any significant changes to the current schema?
I have some ideas about the implementation, but wanted to validate these core aspects first to make sure I'm heading in the right direction. Happy to elaborate on any part of this!
Thanks for the help! Looking forward to your insights.
1/ what do you propose? pros / cons. 2/ @kevvz could help? but you should try it yourself first. 3/ just do it (to know that you're wrong 😏) no worries man, be creative. let's finish the first draft quickly, then we have something to discuss. embracing the changes (good changes :))
@Sanchay-T
Deepgram also has TTS so you could use the same sdk I think that the speech to text is using. Not sure if Omi has a preference there tho
see: https://pub.dev/packages/deepgram_speech_to_text#text-to-speech
Removed @Sanchay-T from assigned - no progress
@beastoin let's try to not assign people if they didn't yet have PRs submitted. We assign only to those who had PRs. Others will need to do a PR first.
@DamienDeepgram try it out bro - looking forward!
Hi @beastoin and @kodjima33
I wanted to clarify the situation regarding my previous assignment. First, I apologize for the delay in updates - I was away for Diwali celebrations in my hometown, which affected my response time. However, I want to assure you that I've been actively working on this in the background:
- I've been going through the codebase thoroughly, particularly focusing on the audio pipeline and BLE integration
- While I have less experience with Flutter/Dart specifically, I bring relevant experience with speech/text models which I believe will be valuable for this feature
- I'm currently working on implementing a proof-of-concept to address the questions I raised earlier, particularly around:
  - Button press handling for voice command triggering
  - Audio playback implementation
  - Memory system integration for voice responses
I understand the policy about assignments and PRs, and I'm committed to submitting a PR with my implementation soon. I should have communicated my temporary absence better, and I appreciate your patience.
Would it be alright if I continue working on this feature and submit a PR for review? I'm happy to share my current progress in more detail if helpful.
Thanks for understanding!
checking speaker functions of current firmware...
Hi @kodjima33 What if we add a separate option to route audio to the phone's output, like AirPods? I think this would be another option for users, as they could listen privately.
@DamienDeepgram don't forget to ref your PR ;)
about the preference to implement this task, be creative. but i think Nik's description is good/simple enough to roll out the first draft. smth like ~
1/ the user presses the button on the device and says something
2/ the device sends that voice to the app
3/ the app sends the voice message to the backend
4/ the backend processes the voice message, then responds to the app with audio bytes
5/ the app sends the audio bytes to the device
6/ the device speaks it out loud.
hope that helps.
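The six steps above can be sketched roughly as follows, in Python since the WIP prototype mentioned in this thread handles the stream on the desktop side in Python. This is only a sketch under stated assumptions, not the actual Omi implementation: `VoiceReplyPipeline`, `chunk_audio`, `CHUNK_SIZE`, and the `backend` / `write_to_device` callables are hypothetical names, the 160-byte packet size is a placeholder for whatever the negotiated BLE MTU allows, and treating the settings switch as disabling the whole flow is one possible reading of "disablable via settings".

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Tuple

# Hypothetical payload size per BLE write; the real limit depends on the negotiated MTU.
CHUNK_SIZE = 160


def chunk_audio(audio: bytes, size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Split audio bytes into packets small enough for a single BLE write."""
    for start in range(0, len(audio), size):
        yield audio[start:start + size]


@dataclass
class VoiceReplyPipeline:
    # backend(voice) -> (prompt_text, reply_text, reply_audio); stands in for steps 3-4.
    backend: Callable[[bytes], Tuple[str, str, bytes]]
    # write_to_device(packet) stands in for the BLE write in step 5.
    write_to_device: Callable[[bytes], None]
    # Assumed meaning of the settings switch: False turns the whole flow off.
    enabled: bool = True
    # (role, text) pairs mirrored into the in-app chat.
    chat: List[Tuple[str, str]] = field(default_factory=list)

    def on_button_voice(self, voice: bytes) -> int:
        """Handle voice captured after a button press; return the packets sent."""
        if not self.enabled:
            return 0
        prompt, reply, audio = self.backend(voice)
        # Both the prompt and the response should be visible in the app chat.
        self.chat.append(("user", prompt))
        self.chat.append(("assistant", reply))
        sent = 0
        for packet in chunk_audio(audio):
            self.write_to_device(packet)
            sent += 1
        return sent
```

Passing the backend call and the BLE write in as plain callables keeps the flow testable without a dev kit: a fake backend and a list's `append` are enough to exercise the chunking and the chat logging before any firmware is involved.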
speaker should support playback over BT, I don't mind looking into this after the Apple Watch PR as it will use a similar two-way transport.
I have started on this, but need to get some sleep #1243 - i think even before i flashed new firmware any button click on my (red) devkit2 causes a fatal crash - not sure if the shipped devkit2's have a different setup with button? Could be a different pin/setup for the button causing this issue.
Code in WIP includes all the BLE setup to stream and handle the stream on desktop side (Python). Once finalised can move to dart code.
@Sanchay-T bro no worries, just keep building this and try to make it work. No one blames you - it's just we assign issues only after first PR
@vincentkoc @DamienDeepgram guys I believe in you. Let's make this work! (ideally today)
You are both working on this, if you both make it work, I'll make a post about both of you and we will solve the bounty issue
fighting 💪
Sorry yes here is the PR with the issue with playback not streaming correctly
https://github.com/BasedHardware/omi/pull/1246
Hey @kodjima33 @beastoin Has this issue been resolved? I've gone through the discussion and am willing to develop this, may I proceed?
Please go ahead and keep us updated @ombhojane
Sure @beastoin
@beastoin I'm stuck setting up the project. I followed Omi's setup instructions; at the last stage it was building the Android Gradle files and downloaded more data than expected. So most of the time went into setup, and I'm still figuring it out. Need to see what's going on and how to fix it.
So how do we integrate this feature if we do not have Omi dev kit devices? Or, can we implement this feature in our Android/iOS device, and if it works there, then it works with OMI devices?
@himmat12 you already know the answer man. the ticket title is super clear.
@ombhojane how's it going?
if you want to get this ticket done - building the app / the firmware is a basic requirement.
Hey @beastoin I'll figure this out today, yesterday was my exam. I'll try manual installation once.
Hi @beastoin I've set up Omi. Now I'm looking into fixing the issue, I'll post updates on the progress