Remove support for camera due to Visual Recognition API not being available anymore

Open robertoetcheverryr opened this issue 3 years ago • 11 comments

As the title says, may I make a PR removing the config.js option for the camera, or at least adding a comment that it's not available anymore? And the same for the ibm-credentials.env file? Maybe also delete the switch case regarding the camera?

robertoetcheverryr avatar Oct 10 '21 14:10 robertoetcheverryr

@robertoetcheverryr I originally decided to leave it in just in case someone had an already-provisioned visual recognition service, but the likelihood of this keeps going down over time. That said, I'm not fully convinced we should remove the camera support, as it's a huge part of what makes TJBot fun! But, it's not clear to me what the other options might be. Are there any vision models we might be able to run locally on the Pi?

jweisz avatar Oct 21 '21 14:10 jweisz

I haven't used it yet, but Google has a Vision API with 1000 monthly queries. I'm not sure if that works with the free account or with the "you must put your CC and only then you get the free tier"...

robertoetcheverryr avatar Oct 22 '21 00:10 robertoetcheverryr

We would not be able to officially support the use of Google APIs with TJBot.

jweisz avatar Oct 22 '21 12:10 jweisz

I do have an already-provisioned visual recognition service, but it seems to have stopped working. I'm having trouble figuring out why. The reply tjbot returns says: Error: <HTML><HEAD><TITLE>Error</TITLE></HEAD><BODY> An error occurred while processing your request.

Reference #30.713a2f17.1639419281.1a1a60f6

Do I have any hope of working around this?

andycitron avatar Dec 13 '21 18:12 andycitron

Likely not. There's no support for the Watson Visual Recognition service anymore as it's been discontinued, and it seems I really should go ahead with removing it from the TJBot library. That said, I haven't yet found a viable replacement, since I'd hate for TJBot to lose the ability to see. 😢

I'm definitely open to someone submitting a PR to replace Visual Recognition with something else. Preferably something on-device, though that might up the hardware requirements...

jweisz avatar Dec 14 '21 15:12 jweisz

Justin, I reimplemented my tjbot code using Microsoft Azure visual services. I uploaded the code fragments to GitHub in case you are interested in incorporating them into your tjbot Node.js implementation. You can find them here: https://github.com/andycitron/tjBotFragmentThatUsesAzureVisualServices

Note that it does introduce a dependency that the user has a Microsoft Azure account.

Also, if you want to incorporate it into tj.see(), you'd want to structure it a bit differently. tj.see() takes a photo. I did not want my Microsoft functions to have a dependency on tj.takePhoto. So I put the 'take a photo' part into the intent processing for 'see' and passed the photo into the code that uses Microsoft functions.
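
The separation described above could be sketched roughly as follows. This is only an illustration of the idea, not the code from the linked repository: `makeSeeHandler`, `takePhoto`, and `describeImage` are hypothetical names, and both collaborators are assumed to be functions supplied by the caller.

```javascript
// Hypothetical sketch: the 'see' intent handler owns the photo step,
// and the vision code only ever receives a finished photo.
function makeSeeHandler(takePhoto, describeImage) {
    // takePhoto: assumed to return (or resolve to) a path to an image file.
    // describeImage: assumed to accept that path and return a description.
    return async function handleSee() {
        const photoPath = await takePhoto();
        // The vision function has no knowledge of how the photo was taken.
        return describeImage(photoPath);
    };
}

// Stubbed usage so the sketch runs without a camera or a cloud account.
const handleSee = makeSeeHandler(
    async () => '/tmp/photo.jpg',
    async (path) => `description of ${path}`
);
```

In a real bot the first argument might wrap tj.takePhoto and the second the cloud vision call; keeping them apart means the vision service can be swapped without touching the camera code.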

The code I put out there includes additional methods that invoke Microsoft facial recognition. That is not part of the 'tj.see()' functionality. That code requires pre-training of the facial recognition models. I included that because it might be useful to someone.

andycitron avatar Dec 24 '21 17:12 andycitron

Hey @andycitron, happy new year. :)

Thanks for the effort you put into TJBot, this is a really great contribution. Unfortunately, I won't be able to make this part of the official library because it uses a competitor's cloud service. But, I will put this on our Featured Recipes page to showcase your work.

jweisz avatar Jan 04 '22 15:01 jweisz

Cool. Yesterday I posted a video to YouTube illustrating how TJ works with Azure. Perhaps you want to include the 4-minute video along with the featured recipe: https://youtu.be/B92efwFqXSs

Could you give me the link to the Featured Recipes page where you referenced my code? I'd like to include a link to that on my home page. thx.

andycitron avatar Jan 05 '22 02:01 andycitron

Here's the link: https://github.com/ibmtjbot/tjbot/tree/master/featured#microsoft-azure-visual-services-by-andycitron

jweisz avatar Jan 05 '22 14:01 jweisz

Justin, Sorry to bother you, but I can’t figure out where to ask this question. It’s not a tjbot issue, just a question.

Where does tjbot store the audio file it uses for speech to text? What format?

I see that Microsoft Azure has ‘person voice recognition’ and I’m thinking about trying that out. Seems it wants a wav audio file as input.

My tjbot gets confused when multiple speakers are talking at the same time. Those utterances usually end up being ignored by my implementation. But every once in a while it’ll try to respond. Because I know who I’m talking to (facial recognition) I can ignore utterances from a different person….or at least I can try.

Do you know the answer? Or can you tell me the proper place to ask this.

I actually think an implementation using multiple microphones and detecting voice based on location in the room makes sense, but that seems very hard.

Thx, Andy

andycitron avatar May 30 '22 15:05 andycitron

Hi @andycitron -- take a look at tjbot.js:792, where listen() is defined: https://github.com/ibmtjbot/tjbotlib/blob/4fe0263bd0050f910752ae589d3b33cdb9cb93ae/src/tjbot.js#L792

The audio isn't stored locally; the data is streamed through a pipe between the microphone and a WebSocket. So it would need some modification to save the output to a file first, before uploading it to the Microsoft service. Maybe check whether their API supports WebSockets?
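
One possible shape for that modification is sketched below, using only Node's built-in `stream` module. It is an assumption-laden sketch, not tjbotlib code: `captureAsWav` and `wavHeader` are hypothetical helpers, and the microphone stream is assumed to emit raw 16-bit PCM at 16 kHz mono, which should be checked against the actual microphone configuration.

```javascript
const { PassThrough } = require('stream');

// Build the 44-byte RIFF/WAVE header for uncompressed PCM audio.
function wavHeader(dataLength, sampleRate, channels, bitsPerSample) {
    const blockAlign = (channels * bitsPerSample) / 8;
    const header = Buffer.alloc(44);
    header.write('RIFF', 0, 'ascii');
    header.writeUInt32LE(36 + dataLength, 4);  // remaining file size
    header.write('WAVE', 8, 'ascii');
    header.write('fmt ', 12, 'ascii');
    header.writeUInt32LE(16, 16);              // fmt chunk size
    header.writeUInt16LE(1, 20);               // audio format 1 = PCM
    header.writeUInt16LE(channels, 22);
    header.writeUInt32LE(sampleRate, 24);
    header.writeUInt32LE(sampleRate * blockAlign, 28); // byte rate
    header.writeUInt16LE(blockAlign, 32);
    header.writeUInt16LE(bitsPerSample, 34);
    header.write('data', 36, 'ascii');
    header.writeUInt32LE(dataLength, 40);      // PCM payload size
    return header;
}

// Tap a readable stream of raw PCM and resolve with a complete WAV buffer
// once the stream ends. The defaults are assumptions, not TJBot settings.
function captureAsWav(micStream, { sampleRate = 16000, channels = 1, bitsPerSample = 16 } = {}) {
    const tap = new PassThrough();
    const chunks = [];
    const done = new Promise((resolve, reject) => {
        tap.on('data', (chunk) => chunks.push(chunk));
        tap.on('end', () => {
            const pcm = Buffer.concat(chunks);
            const header = wavHeader(pcm.length, sampleRate, channels, bitsPerSample);
            resolve(Buffer.concat([header, pcm]));
        });
        tap.on('error', reject);
        micStream.on('error', reject);
    });
    // A readable stream can be piped to several destinations, so the
    // existing pipe to the speech service can stay in place.
    micStream.pipe(tap);
    return done;
}
```

The resulting buffer could then be written out with something like `fs.promises.writeFile('utterance.wav', wav)` (the file name is only an example) before being uploaded. This version buffers the whole utterance in memory, which is simple but only reasonable for short recordings.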

jweisz avatar May 31 '22 15:05 jweisz

Closing as this is now an issue in the tjbotlib repo: https://github.com/ibmtjbot/tjbotlib/issues/73

jweisz avatar Jan 13 '23 15:01 jweisz