[EN] Match articles "a" and "an" for <the>

Open tannisroot opened this issue 1 year ago • 4 comments

For me, Whisper seems to always insert the article "a" after the vacuum commands "start" or "return". Someone might actually say it this way too, so let's handle that.

tannisroot avatar Feb 25 '25 07:02 tannisroot

First of all, I don't think it's a good idea to prefix <name> (which can contain <the>) with an indefinite article. You would be able to say "start a the roborock", which doesn't make grammatical sense.

Second of all, if we're doing this, I don't see why the "an" form wouldn't be in there as well.

Third, I don't see why this would apply strictly to vacuums and not every other entity.

Fourth, to counter the issues above and create new ones, why not add a[n] to <the>?

Finally, like I said numerous times before, I don't think it's wise to add incorrect sentences just to please Whisper or any other STT. The proper solution here would be to fix Whisper.

I'd like to hear the other language leaders' comments on this.

tetele avatar Feb 25 '25 08:02 tetele

Oh, I agree on most of this, and I can change the PR to use a[n] instead of a change just for the vacuum. I just thought it would be an issue to have a change that affects other commands. I would be more than happy to add the change directly to <the> if that's what would be preferred. And yes, in my case it is about fixing Whisper's output, but since Whisper is, like LLMs, a statistical model, it means that in the data it was trained on, people would often say it with "a", or at least say it in a way that makes it sound like an "a". I am certainly guilty of mushing the "the" in such sentences in a way that makes it sound almost like "a". It's not grammatically correct, sure, but the point of intents is to understand all people, not just people with good grammar knowledge, good pronunciation or the right accent/dialect, and it seems innocent enough?

tannisroot avatar Feb 25 '25 11:02 tannisroot

Also, as much as I would love to fix Whisper to follow grammar, there is only so much you can do to influence its output. OpenAI probably trained it in mega powerful datacenters on all the speech data they sucked off the internet, so it's not really realistic to somehow fix all the edge cases like this without the expertise and resources they had. And yes, there is speech-to-phrase as an alternative, but in my testing bigger Whisper models are far, far better at understanding the noisy, imperfect audio that you typically get out of Assist satellites, as well as at telling apart "turn off" and "turn on", so I am afraid it is here to stay for those who need local STT.

tannisroot avatar Feb 25 '25 11:02 tannisroot

I went ahead and added "a[n]" directly to <the>, since adding it to the intents themselves leads to wonky matching.

tannisroot avatar Feb 26 '25 04:02 tannisroot
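
To make the difference between the two approaches discussed in this thread concrete, here is a minimal sketch in Python. It is not the real matcher: `expand` is a hypothetical toy expander that handles only optionals `[...]`, alternatives `(a|b)` and rule references `<rule>`. The rule definitions (`(the|my)` for `<the>`, `[<the>] roborock` for `<name>`) are assumptions made for illustration and may not match the actual contents of the English common rules.

```python
import re


def _parse_alt(s, i):
    """Expand alternatives from index i up to a closing bracket or the end.

    Returns (list of expanded strings, index where parsing stopped).
    """
    options = []
    current = [""]
    while i < len(s) and s[i] not in ")]":
        ch = s[i]
        if ch == "|":
            # Finish the current alternative and start a new one.
            options.extend(current)
            current = [""]
            i += 1
        elif ch in "([":
            inner, i = _parse_alt(s, i + 1)
            i += 1  # skip the closing bracket
            if ch == "[":
                inner = inner + [""]  # an optional may also be absent
            current = [a + b for a in current for b in inner]
        else:
            current = [a + ch for a in current]
            i += 1
    options.extend(current)
    return options, i


def expand(template, rules):
    """Return every sentence a template can produce (toy subset of the syntax)."""
    # Inline rule references such as <the> until none remain.
    while True:
        inlined = re.sub(
            r"<(\w+)>", lambda m: "(" + rules[m.group(1)] + ")", template
        )
        if inlined == template:
            break
        template = inlined
    sentences, _ = _parse_alt(template, 0)
    # Collapse the doubled spaces left behind by skipped optionals.
    return {" ".join(s.split()) for s in sentences}


# Assumed rule contents, for illustration only.
OLD_RULES = {"the": "(the|my)", "name": "[<the>] roborock"}
NEW_RULES = {"the": "(the|my|a[n])", "name": "[<the>] roborock"}

# Approach 1: prefix <name> with an article inside the vacuum sentence.
per_intent = expand("start [a] <name>", OLD_RULES)

# Approach 2: add a[n] to <the> and leave the sentence alone.
in_the_rule = expand("start <name>", NEW_RULES)

print("start a the roborock" in per_intent)   # admitted by approach 1
print("start a the roborock" in in_the_rule)  # not admitted by approach 2
print(sorted(in_the_rule))
```

Under these assumptions, the per-intent prefix admits "start a the roborock", while placing a[n] inside `<the>` admits "start a roborock" without stacking two articles. The second approach does also admit "start an roborock", since the template has no notion of which article fits the following word.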