willow
Sometimes it's necessary to repeat the wake-up word before Willow wakes up
I have two devices running Willow built from a repo I cloned on June 11 (is there a better way to specify what version I'm running?). Each device is in a completely separate part of the house (1st floor kitchen and lower level family/rec room).
I only ever use the wake word "Alexa".
After not using the device for a while (sometimes as little as a couple hours), I find that I need to say "Alexa" two or more times before I see the screen turn on.
There is very little ambient noise in these rooms. In the kitchen you can barely hear the refrigerator running. In the family room, there's basically nothing generating any ambient noise.
I experience the problem even when I'm quite close (2-3 feet) to the devices. For example, in the kitchen picture below, I can be standing at the counter top directly in front of the device.
Here are pictures so you can see the environment in which the devices are located.
Kitchen
Family/Rec Room
Just chiming in here, I noticed today both my willows will not respond to wake at all after sitting 24 hours idle. I had to power cycle them in order for them to start listening again.
When @mhilbush first reported this on the openHAB forums I related that I haven't seen this issue personally. I have an ESP BOX in my bedroom only used to turn off the lights when I go to sleep and there are roughly 24 hours in between voice commands.
That said, I have some devices with storage of logs that are in an acoustically isolated environment for testing. I've flashed them with full debug builds to try to reproduce this with a more scientific approach to "24 hours in between commands/activity". My first attempt will be three hours in between commands and I'll step it up from there.
Stay tuned!
Just for some more info, the devices still replied to WAS commands; I basically rebooted them remotely by doing a config push, and that seemed to bring them back to life. This is also the first time I have noticed this; I am fairly certain I have left them for 24 hours or more without a command before and they worked fine when I said the wake word the next day, so not sure what specific scenario caused this.
I assume @mhilbush is running main - @mhilbush can you confirm?
feature/was represents a TON of development with significantly less testing. I'm not necessarily surprised to see this issue reported there, but seeing it occur on main is very surprising.
So crazy thing, it just happened to me again just now. Sitting idle for probably 3-ish hours, won't respond to wake word. Is there some way I can store logs or pull log data from it to see if anything is there?
Correct, I'm running off the main branch.
Sitting idle for probably 3-ish hours
Yes, I've seen it happen after being idle for less than an hour. In fact, I've been trying to narrow down the amount of idle time that causes it to miss the first Alexa. And, interestingly, I have seen it happen when it's been idle even for a few minutes. And I'm also now trying to change some attributes of how I say Alexa (e.g. speed, tone).
I would love to know what's happening after I say Alexa the first time when it doesn't wake up.
Just to be clear, in my case it becomes completely unresponsive, meaning I can repeat the wake phrase dozens of times and it will never respond. Also will not wake when touching the touchscreen.
in my case it becomes completely unresponsive, meaning I can repeat the wake phrase dozens of times and it will never respond. Also will not wake when touching the touchscreen.
Yes, that's very different from what I'm seeing. I've never had the device become completely unresponsive like that. Usually after 2 attempts, and sometimes after 3 or 4, it wakes up.
@mhilbush - On variation in wake pronunciation, etc.: to track this down further, my sealed testing environment uses recordings played out via a speaker, which separates potential wake/audio/environment issues from other potential software issues. I'm giving it a full three hours at idle and will attempt to reproduce this in roughly one hour. This environment also runs debug builds with full log capture enabled.
@nikito - Between last night and this morning I observed what you are describing. Interestingly, it only happened on my Box test device and not the Lite so that's potentially a hint in the right direction. Then again could have just been a fluke. More testing should help clear this up.
It actually happened to me again an hour or so later. Pushing config from WAS again fixed it. So it seems that whatever code is waiting for the wake word somehow locks up, or something to that effect? I haven't tried to debug anything (not really sure how to live debug the Boxes 😆 ). I suppose I could try to plug one into my server and monitor the serial port to see if it notes anything weird before it happens?
Strange but I'm actually relieved to hear it's repeatable after an hour as opposed to 24!
Yes, if you can do a debug build with connection to serial and record logs that is very helpful (I'm taking the same approach).
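For anyone who wants wall-clock times next to the serial output (handy for matching a missed wake word to a log line hours later), here is a minimal sketch. It is not part of Willow. It assumes the serial port has already been configured by other means and can be read as a line-oriented text stream. The device path and log file name in the usage note are hypothetical.

```python
import io
from datetime import datetime, timezone


def timestamp_lines(stream, now=lambda: datetime.now(timezone.utc)):
    """Yield each non-empty line from `stream` prefixed with a UTC timestamp."""
    for raw in stream:
        line = raw.rstrip("\r\n")
        if line:
            yield f"{now().isoformat(timespec='seconds')} {line}"


def capture(stream, log_path):
    """Append timestamped lines from `stream` to `log_path` until the stream ends."""
    with open(log_path, "a", encoding="utf-8") as log:
        for entry in timestamp_lines(stream):
            log.write(entry + "\n")
            log.flush()  # keep the file current in case the device locks up


# Demo with an in-memory stream and a fixed clock (placeholder text, not real Willow output).
fixed_clock = lambda: datetime(2000, 1, 1, tzinfo=timezone.utc)
demo = list(timestamp_lines(io.StringIO("first line\n\nsecond line\n"), now=fixed_clock))
print(demo)
```

On Linux, usage might look like `with open("/dev/ttyACM0", errors="replace") as port: capture(port, "willow-kitchen.log")`, where both names are placeholders for whatever your setup uses.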
Something else I noticed, when it first boots it seems VAD doesn't work on the first command; I say something like "What time is it?" and it takes up to the 5 second timeout to then acknowledge the command and reply. If I repeat the command the VAD is near instant from then on for all other commands. Not a huge deal, but figured I'd mention it. 😄
So now I have an audio recording of me saying Alexa. I'm doing some tests with it now. Leaving several minutes between each attempt. It's bizarre. I'm not changing the location of the device or my phone. There's little to no background noise.
- attempt 1: woke up the 3rd time I played it
- attempt 2: woke up the 1st time
- attempt 3: woke up the 2nd time
- attempt 4: woke up the 1st time
- attempt 5: woke up the 1st time
- attempt 6: woke up the 5th time
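The six attempts above can be boiled down to a couple of numbers. This is only a rough sketch over a very small sample, and the function and key names are made up for illustration.

```python
from statistics import mean

# Number of times the recording was played before the device woke,
# one entry per attempt listed above.
plays_until_wake = [3, 1, 2, 1, 1, 5]


def summarize(plays):
    """Summarize how many plays each wake-up needed."""
    first_try = sum(1 for p in plays if p == 1)
    return {
        "attempts": len(plays),
        "first_try_rate": first_try / len(plays),       # woke on the very first play
        "mean_plays": mean(plays),                       # average plays per wake-up
        "per_play_wake_rate": len(plays) / sum(plays),   # wake-ups per play overall
    }


summary = summarize(plays_until_wake)
print(summary)
```

With these six attempts, the device woke on the first play half the time, and 13 plays produced 6 wake-ups.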
BTW, is there an Android app you would recommend for capturing audio?
Edit: Here's a pic of where I am in relation to the device. I'm about 5 to 6 feet from it. Basically, I hold the phone near my head and play the audio. If it doesn't wake up, I wait a few seconds, then play it again.
So quick update, since turning on Debug and deploying to both my ESP Boxes, neither has had the issue now. If I don't notice it happen again I may flip one of the boxes to a non-debug build for laughs, just to see if that somehow makes a difference. 😆
@nikito it is actually quite common on this type of device to see problems that don't manifest with debug on. It tends to point to something timing related?
Seeing some interesting numbers in the debug:
Not sure if these are normal or if it is doing something odd 😄
@nikito - VAD on first boot is a known issue. We think it has something to do with audio init and the audio front end framework calibrating/settling/something but we haven't explored it further as it's very low on our list of priorities.
You are seeing some FUNKY numbers from that debug output... I have no idea what's going on there.
I can also confirm from my testing today that the issue does not occur with debugging enabled: three separate attempts in my controlled environment, spaced three hours apart. I'm going to do the latest regular builds and flash with those to see if I can trigger it with the same time spacing (although I'm not sure how helpful the output will be). At least we're running WAS so OTA is painless :).
@hamishcunningham - Yep, unfortunately. We will see when I run my non-debug builds but depending on debug options selected they run additional tasks that will keep the device warm(er). Although with wifi management, SNTP, etc it's not as though the device ever sleeps or similar but it will be a good datapoint regardless.
@mhilbush That is bizarre and a terrible experience. What I can tell you from my experience recording and playing back prompts is that the audio needs to be as high fidelity as possible. In my case I record on my desktop with my Logitech C920 microphone as the source in 48kHz audio. I then play it back with no resampling, conversion, etc. We have HIGHLY accurate and repeatable results with this process - our routine "torture testing" is 1000 repetitions of a couple of different recordings. The last test run like this woke, captured speech, and successfully executed the HA command 996/1000 times on the ESP BOX.
I don't use Android so I can't recommend a good app but the other potential issue is that many devices, apps, etc do their own front end audio processing which could result in a lack of fidelity on playback from the perspective of the ESP BOX. Additionally, the playback hardware factors in as well (of course) and the frequency response of the speakers, etc from the device could be a factor as well. That said you're having terrible results with your own speech so it may not be a factor at all.
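One quick way to see what a recording app actually produced is to read the file header. The sketch below uses Python's standard `wave` module, so it assumes the recording is (or has been exported as) an uncompressed WAV file. It will not read FLAC or other formats. The helper names are made up, and the demo file is generated silence rather than a real recording.

```python
import os
import tempfile
import wave


def describe_wav(path):
    """Return the basic audio parameters stored in a WAV file header."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "sample_rate_hz": rate,
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": wav.getnframes() / rate,
        }


def write_silence(path, rate, seconds=1):
    """Write a mono 16-bit WAV of silence, used only to demo describe_wav."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * rate * seconds)


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.wav")
    write_silence(demo_path, 32000)
    info = describe_wav(demo_path)
print(info)
```

If the reported sample rate is lower than what you expected to capture, the app or device is probably resampling somewhere along the way.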
That said you're having terrible results with your own speech so it may not be a factor at all.
Correct. I only switched to a recording (which I discovered is 32kHz) to remove the variable that my own speech might be inconsistent from one try to the next. I see no difference between using my own voice and the recording of my voice.
At this point I'm really at a loss as to what to do next. If the wake word isn't reliable for me, everything else is pretty much pointless.
I'm starting to wonder if there's something wrong with the lot of devices from which I purchased my 2.
I'm not sure where you're located, but would you like me to ship you one of my devices?
@mhilbush We've placed so much emphasis on wake word because of exactly your point - wake recognition is the equivalent of the power switch. If it doesn't activate, it is useless. So I completely understand your frustration there.
It's unlikely you have two bad units. From what I can remember I think we've only had one unit across all users which turned out to be defective. In terms of hardware the biggest issues seem to be power supplies and cables.
I should note that Willow from concept to now is just under three months old with the first developer "release" being roughly six weeks ago. It is extremely young and wake word, far field audio, speech recognition, etc for every speaker in every environment is very difficult. Alexa is eight years old and other open source projects in the space like Rhasspy are several years old. I am very confident that with time, testing, and additional development we will achieve our goals of being a great voice user interface for everyone. Time being weeks/months - not years. I understand that doesn't help you now but I wanted to add some context.
I think you can appreciate that if your experience was typical we wouldn't have gotten anywhere and our GH issues would be full of hundreds if not thousands of these kinds of reports by now.
I think it's still a bit too early in exploring this issue to take the drastic step of shipping a device but I appreciate the offer.
I just did another 2.5 hour idle test across box and box lite, woke right up immediately and successfully executed the command as expected. @nikito - this was with only debug logging turned on (no task/mem printing, etc).
@kristiankielhofner yeah I noticed my devices with debug, even without monitoring, seemed fine all day. As a test I left one on debug build and set another to a non-debug build. I'll check both tomorrow and see how they fare. 😄
Update this morning, left the devices idle all night, one on debug and one with no debug turned on, and they both responded and worked fine this morning. Really not sure what the issue would be, maybe reflashing somehow fixes it? 😆
@nikito - same here. Like any transient issue good and bad - it "fixed" itself for unknown and speculative reasons. Of course depending on underlying cause it may very well come back on us... Regardless of the short-term outcome we'll definitely leave this issue open for a while in the event it comes back in the near future.
@mhilbush - Somewhat embarrassingly it took me way too long to come up with another debug step on your issue. Can you provide your existing recordings? If you're okay with that you should be able to upload them here or provide a link. Additionally, going the other direction, the recordings I use (of my own voice) for the testing I have described are in tree at misc/*.flac. You may not have the associated command endpoint to actually execute the command(s) but when you play them back it will likely provide additional insight on your issue of failure to wake by separating voice from environment and specific unit.
@kristiankielhofner No problem. I have one recording only - just the wake word Alexa. 😉 Since I couldn't wake it up reliably, I haven't created any other recordings. And, I wanted to find an app that would let me have more control over the audio quality.
Here's a link: https://drive.google.com/file/d/1Xo1E20ONlCbyOPRQLiHK7vapyxIBZuUT/view?usp=sharing
I should note that Willow from concept to now is just under three months old with the first developer "release" being roughly six weeks ago. It is extremely young and wake word, far field audio, speech recognition, etc for every speaker in every environment is very difficult.
I understand that Willow is very new. And please don't get me wrong, I'm super impressed with what you've been able to accomplish in such a short time. But perhaps there's something I don't quite understand. I thought the wake words "Alexa" and "Hi ESP" were part of the Espressif platform, not part of Willow. Possibly just my misunderstanding. And, if they are part of the platform, I certainly don't know when or how well that aspect of the platform was created.
I came from the "main" branch, meaning flashing Willow from a connected server. No noticeable errors or blank screens. Last week I installed and ran WAS on the same server where I have WIS running. After I flashed three ESP-BOXs and one -Lite, the units would all just freeze up with a black screen. There is no specific amount of time that elapses before it happens again - it seems to be random. A power-cycle is the only thing that makes them respond again.
In the meantime, WAS does not report them as disconnected from the server... but the screens are blank and not responding.
I believe this is a known issue, but isn't yet deployed to the builds WAS is using. You would need to follow the steps to pull willow from the feature/was branch and manually create a build for local OTA and push. It would look like this:
Sorry for my stupidity - a total NOOB here with GitHub and "pulling a build".
This is what I see next to my only two connected ESP-BOXs:
When I click on the OTA button next to the ESP-BOX, the ESP-BOX gets flashed (or it cycles power) and does not appear under WAS anymore. How do I "pull willow from a feature/was branch" and manually create a build?
As I said earlier, I flashed the "main branch" to one ESP-BOX and the -Lite and they have been up and running without "blanking out" for a few hours now. But they also do not show up under WAS, and the TTS responses are not working as they did in the WAS builds I had.
This would involve building the Willow code as outlined on the willow repo (you would have to git pull the feature/was branch) and then following the steps on the WAS repo to copy the build into the WAS container. This is all stuff that will go away for regular users with the 1.0 release; most of these activities would only be done by devs 😄