smart-home-nodejs

OPEN_AUTH_FAILURE

Open akshar001 opened this issue 5 years ago • 23 comments

I know this is a long-standing issue, and it still happens to many users. I have searched all over the web for a solution but haven't found one. The reason is that there is no proper documentation addressing this issue. I know people sometimes come up with a bad OAuth 2.0 server implementation and just mess up. But in the end this feature is a great thing from Google, and people are going to try it. I have been using the same setup for three months. Three months ago it was working fine. Now when I try to do account linking, this error shows up in Stackdriver.

{
 insertId: "1l5phc0f4bx2mq"
 logName: "projects/dot-vegg/logs/actions.googleapis.com%2Factions"
 receiveTimestamp: "2019-06-08T04:32:52.627529629Z"
 resource: {
  labels: {
   action_id: "SMART_HOME_SYNC"
   project_id: "dot-vegg"
   version_id: ""
  }
  type: "assistant_action"
 }
 severity: "ERROR"
 textPayload: "SYNC: Request ID 15740549287276813436 failed with code: OPEN_AUTH_FAILURE"
 timestamp: "2019-06-08T04:32:52.591737495Z"
}

Yet again, everything works fine in the OAuth 2.0 Playground, so I want to know where the problem is. This issue should serve as a discussion thread. I have seen that many people have the same kind of issue and some have somehow solved it. Please write down everything you have done to solve this issue so we can understand where the problem is. I found one solution written in Chinese, and this is what it says:

  1. He uses JWT access tokens instead of simple bearer tokens. So is it necessary to use a JWT bearer token, or would a simple access token work?
  2. There are some suggestions to log out of all accounts and sign in with only one, but that does not work either.

And yes, no SYNC request is reaching my side. I tried Google support by mail, but in the end I was told to figure this out by myself. So let's settle this once and for all by properly discussing where the problem is. It can save others a lot of time.

akshar001 avatar Jun 08 '19 04:06 akshar001

Getting an OPEN_AUTH_FAILURE means there's some issue with your OAuth integration. Unfortunately, this is a broad challenge and one that may not be easily debugged due to the sensitive nature of user accounts.

A simple access token should work as the final response. In this sample you can see how I use hardcoded tokens for access. I also wrote a blog post about using Auth0. Beyond that, one would need to do debugging on their side to see if tokens match.

Fleker avatar Jun 10 '19 15:06 Fleker

Thanks @Fleker. I have seen that you have been replying to this type of issue for 2 years now, and I'm glad that you still answer them. The problem is that when I run the Google OAuth 2.0 Playground, it all works fine. I can even refresh the token. But when we use the same thing in the app for smart home, it doesn't work. This is one thing that Amazon provides: https://developer.amazon.com/blogs/post/TxQN2C04S97C0J/How-to-Set-up-Amazon-API-Gateway-as-a-Proxy-to-Debug-Account-Linking If Google could provide the same thing, it would solve the problem for hundreds of people. P.S. I have also read the blog post. It's very nicely written, though it cannot help us debug our code. And yes, I am using the same OAuth 2.0 server for Amazon Alexa and it's working fine.

akshar001 avatar Jun 11 '19 08:06 akshar001

Hi @Fleker ,

I'm having the same problem. After digging for a few days and checking logs on my server and in Google Stackdriver, I think I have pinpointed it now.

The problem is that when the user's access token expires on Google's server, Google asks for a new access token using the refresh token. At that step, it gets an error. As there are no docs available on how Google uses the refresh token, I'm unable to move further.

Just for your information, I'm following the guidelines from Auth0 for using refresh_token: https://auth0.com/docs/tokens/refresh-token/current

My application works fine for Alexa and SmartThings, and users have no problems refreshing their tokens. But it is not working for the Google Smart Home project.

Kindly help with how Google gets a new access token using the refresh token.

mithoog avatar Jul 05 '19 22:07 mithoog

Google doesn't do anything custom with regards to getting refresh tokens.

Fleker avatar Jul 09 '19 18:07 Fleker

What I see (and just verified again) is that Google does not proactively refresh the session. For example, I had an OAuth2 session that expired after 10 minutes (the mobile app was on the screen of a simple device such as a wall plug), but Google had not refreshed it even half an hour after it expired (as set by expires_in). Once I clicked on one of the device controls in the app, Google used the refresh token to obtain a new access token. I am sure that other services, such as SmartThings, proactively refresh about 1 minute or so before the token expires, even if the GUI is not active on the mobile device. I remember having read somewhere that in general services should proactively refresh. To be safe, an OAuth2 service (mine does) should probably provide a grace period (days, weeks, months?) before invalidating a refresh/access token, or make the refresh token perpetual. I also sent a "reportState" after the token expired, but even then no refresh request was made to the OAuth2 endpoint. A RequestSync does result in a refresh_token request.

My two cents...

symdeb avatar Aug 16 '19 08:08 symdeb

Appreciate all the feedback on this.

@mithoog

Problem is when user's access token expires on Google's server then Google asks for new access token using refresh token. At that step, it get error.

Since the access token request is typically triggered by an intent that comes in after the token expires, there is often a very short time between asking for the new token and the request that has the new token in it. You might verify that your server is not encountering a race condition where it's trying to verify the intent request against the old token.

@symdeb

I am sure that other services, such as Smartthings proactively refresh about 1 minute or so before expiration

Many OAuth server implementations do not generate a new access token until the current one has expired (they will simply return the same token while it's valid), so an implementation like this would realistically still need to wait for the expiration time. Proactive refresh is an interesting option to explore, though.

To be safe, an Oauth2 service (mine does) probably should provide a grace period (days,weeks,months?) before invalidating an refresh/access token or make the refresh token perpetual.

In general, you should only invalidate a refresh token under conditions where you want to require the user to re-authenticate. If you revoke the refresh token given to Google, the user will be forced to relink their account. Practically speaking, this means the refresh token should almost never expire.
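One way to rule out the race described above is to persist the new access token before the token response is sent, and to let the previous access token age out on its own expiry instead of deleting it at refresh time. The following is only a minimal sketch under those assumptions; `AccessTokenStore` and its methods are hypothetical names, and the in-memory `Map` stands in for whatever database a real OAuth server uses.

```typescript
// Hypothetical in-memory store (an assumption for illustration only).
interface AccessTokenRecord {
  userId: string;
  expiresAtMs: number;
}

class AccessTokenStore {
  private readonly tokens = new Map<string, AccessTokenRecord>();

  // Call this BEFORE returning the token response, so an intent that
  // arrives right after the refresh can already be verified.
  issue(token: string, userId: string, ttlSeconds: number, nowMs: number): void {
    this.tokens.set(token, { userId, expiresAtMs: nowMs + ttlSeconds * 1000 });
  }

  // Resolve a bearer token to a user, or undefined when unknown or expired.
  // Older tokens are not deleted on refresh; they simply age out here.
  verify(token: string, nowMs: number): string | undefined {
    const record = this.tokens.get(token);
    if (record === undefined) {
      return undefined;
    }
    if (nowMs >= record.expiresAtMs) {
      this.tokens.delete(token);
      return undefined;
    }
    return record.userId;
  }
}
```

With this shape, an intent carrying either the old (not yet expired) or the new access token resolves to the same user during the short overlap around a refresh.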

devunwired avatar Aug 16 '19 13:08 devunwired

Hi Akshar001... did you find a solution for this? I have the same problem

javiercuellar73 avatar Oct 09 '19 14:10 javiercuellar73

@javiercuellar73 No, I still haven't. Unless Google helps us set up a debug process, it will be hard for us to find out.

akshar001 avatar Oct 15 '19 10:10 akshar001

I have a similar problem. But not with the account linking. For me it occurs once a day (after a long pause - N hours). I did everything that @Fleker described in this guide in terms of auth0 integration.

In stackdriver logs I see the same picture. But there are 2 records. First one is mentioned in the initial post (the difference could be only in actionId, depending on the request you make, and PID). The second record is the following:

{
 insertId: "134lmz4frpl0c2"  
 logName: "projects/PID/logs/actions.googleapis.com%2Factions"  
 receiveTimestamp: "2019-10-27T19:08:58.489093607Z"  
 resource: {
  labels: {
   action_id: "SMART_HOME_EXECUTE"    
   project_id: "PID"    
   version_id: ""    
  }
  type: "assistant_action"   
 }
 severity: "ERROR"  
 textPayload: "requestId 12540743835849591185: Agent responded empty JSON."  
 timestamp: "2019-10-27T19:08:58.483968995Z"  
}

I'm using Google Home speaker. And it says that my smart home action cannot be reached. I don't see any requests to my backend in logs. I can't interact with devices from within Google Home app as well. So it definitely fails while Google is trying to access auth0 layer.

Btw, manual sync requests fail as well with some strange 403 errors. The only way I could fix it was account re-linking.

I'm testing smart home action for 3 days so far. And each day I see this picture after a long idle period. So it's obvious that the reason is in tokens' expiration. However, I don't understand why Google can't refresh them automatically.

I can compare this setup with Alexa Skill and Amazon oauth integration. The process is pretty similar: one side contains client id / secret, auth / token URIs; the other side - allowed return URLs. However, when we link an account in Alexa app, it lives forever. There's no need to think about some manual tokens' micro management. But in case of Google, it seems like there's either a bug or some auth0 misconfiguration in the provided guide.

Looking forward to receiving some assistance on this.

P.S. Google has a very big advantage in comparison with Amazon (in terms of smart home skills / actions). It doesn't force using cloud functions as a backend (Amazon forces to use lambda function). As a result, Google Home works much faster than Alexa. So for me (Alexa user) there's only a single showstopper in terms of migration - this pretty annoying oauth issue.

sskorol avatar Oct 27 '19 20:10 sskorol

Tokens are refreshed automatically as-needed. But it could be that the test state for your Action expired, and you can restart that state in the Actions Console.

Fleker avatar Oct 29 '19 14:10 Fleker

I found this old but interesting thread. There were several cross-references between the Auth0 and Google communities that support the following conclusions:

  • neither Google nor Auth0 really wanted to negotiate this issue together;
  • there's no proof that it's fixed, as the issue was closed when an ugly workaround was found;
  • after a number of attempts (following the author's hints) I found that a workaround with global audience setup does work;
  • original guide doesn't meet all Auth0 requirements: 1) missing offline_access scope for getting refresh tokens; 2) missing audience configuration, which can be set only globally as Google doesn't allow passing audience to authorize endpoint.

If it's already fixed on Google side, would be appreciated if you could provide any reference.

In the meantime, it's now obvious that the issue is related to the fact that Google can't actually update a refresh token (which by default expires in 24 hrs, unless you create an Auth0 API with machine2machine app type, and configure this timeout manually), because required scope and audience is not set.

For those still struggling with similar issues between Google and Auth0, here's a workaround:

  • follow this guide to create an API and machine2machine app;
  • follow @Fleker guide to configure a smart home action + callback / web origin URLs on Auth0 side (for machine2machine app);
  • add offline_access scope to the account linking section of your action;
  • copy your Auth0 API identifier and paste it as a default audience in your global Auth0 settings;
  • play with refresh token timeouts on Auth0 API level to check if everything works as expected.
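A quick way to check whether the steps above took effect is to inspect the JSON that the token endpoint returns for the authorization code exchange: if offline_access was not granted, the response would typically lack a refresh_token. The helper below is a hedged sketch, not part of any Auth0 or Google library; `findTokenResponseProblems` is a hypothetical name, and the field names follow the standard OAuth 2.0 token response.

```typescript
// Standard OAuth 2.0 token response fields; all optional here because the
// point of the helper is to detect the ones that are absent.
interface TokenResponse {
  access_token?: string;
  token_type?: string;
  expires_in?: number;
  refresh_token?: string;
}

// Returns human-readable problems; an empty list means the response looks
// usable for a long-lived account link.
function findTokenResponseProblems(response: TokenResponse): string[] {
  const problems: string[] = [];
  if (!response.access_token) {
    problems.push("missing access_token");
  }
  if ((response.token_type || "").toLowerCase() !== "bearer") {
    problems.push("token_type is not Bearer");
  }
  if (typeof response.expires_in !== "number" || response.expires_in <= 0) {
    problems.push("missing or non-positive expires_in");
  }
  if (!response.refresh_token) {
    problems.push("missing refresh_token (was offline_access granted?)");
  }
  return problems;
}
```

Running it against a captured response from your own token endpoint (for example one copied from the OAuth 2.0 Playground) shows at a glance whether a refresh token is being issued at all.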

sskorol avatar Oct 29 '19 20:10 sskorol

I don't use Auth0, as I have my own OAuth2 service. So in that case, calling RequestSync every day for each UserAgent might be the only option to trigger a request within the token expiration period. BTW, contrary to the posting above that the service was unreachable, I do not experience that. I can abstain from using the Home app for days and Google won't make any refresh call. Google will, though, call the OAuth2 service's refresh with the last (but on my side "officially" expired) refresh token once the app is used again. It does mean that on my side the token cannot be deleted, and the OAuth2 session will stay in the DB forever and never expire. Samsung SmartThings, however, does do "proactive refresh". Implementations differ a lot between ecosystems regarding OAuth2. Even with one standard there are different approaches.
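For reference, a daily RequestSync as described above could be prepared roughly as follows. This is a sketch only: it builds the HTTP request for the Home Graph `devices:requestSync` method without sending it, `buildRequestSync` and `buildDailySyncBatch` are hypothetical helper names, and obtaining the service account access token is assumed to happen elsewhere.

```typescript
// Plain description of an HTTP request; sending it is left to the caller.
interface HttpRequest {
  method: string;
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build one Home Graph RequestSync call for a single linked user.
// serviceAccountToken is assumed to be a valid OAuth access token for the
// project's service account (how it is obtained is out of scope here).
function buildRequestSync(agentUserId: string, serviceAccountToken: string): HttpRequest {
  return {
    method: "POST",
    url: "https://homegraph.googleapis.com/v1/devices:requestSync",
    headers: {
      "Authorization": `Bearer ${serviceAccountToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ agentUserId: agentUserId, async: true }),
  };
}

// One request per linked user, for example from a daily scheduled job.
function buildDailySyncBatch(agentUserIds: string[], serviceAccountToken: string): HttpRequest[] {
  return agentUserIds.map((id) => buildRequestSync(id, serviceAccountToken));
}
```

Whether a periodic RequestSync is enough to keep a link alive depends on the OAuth provider's expiry rules; the thread only reports that it worked for one custom OAuth2 service.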

symdeb avatar Oct 30 '19 03:10 symdeb

@symdeb for me RequestSync calls were constantly failing with 403 error after token expiration. But again, that seems to be Auth0-specific case, as if Google side hasn't met their requirements for the refresh tokens, you can't reach your backend at all.

sskorol avatar Oct 30 '19 08:10 sskorol

Hi @akshar001 , did you find a solution for this? I am having the same problem and I searched everywhere on the internet but I couldn't find any way to fix it.

WaldenLiang avatar Jun 16 '20 03:06 WaldenLiang

@WaldenLiang No, I haven't, but @sskorol suggested an interesting option to solve it.

akshar001 avatar Jun 16 '20 03:06 akshar001

@akshar001 I have just solved the problem. Follow the steps below; I hope they solve your problem too.

  1. Delete the Firebase function and create a new one;
  2. Uninstall your old firebase-tools and install the latest version;
  3. Re-deploy your function;
  4. Test account linking.

This method solved my problem perfectly. I hope it works for you.

WaldenLiang avatar Jun 24 '20 02:06 WaldenLiang

I experienced the same issue yesterday while setting up this community example. In order for my account to be linked and the sync to execute, I had to allow unauthenticated access to the cloud functions for the Authorization URL and Fulfillment URL, as outlined here. I believe this is supposed to happen implicitly for HTTP functions with a newer version of firebase-tools, but I have had to set it manually in the Cloud Console. @devunwired, does this seem accurate? Should unauthenticated access for the 'allUsers' group be applied on the Fulfillment URL in addition to the Authorization URL functions?

nolandubeau avatar Jul 30 '20 12:07 nolandubeau

I have some people who have a similar issue. I have some more comprehensive logging around this, and what I see for them is that multiple refresh token requests fire within seconds of each other, and specifically that there is some overlap between the requests starting and returning. By chance, do those of you with this issue have a daily routine set up that would update more than one thing through this integration? I'm wondering if the routine runs all steps in parallel without refreshing the token up front, so that each step makes its own refresh call and you get a race condition for the refresh token.

The only other thing I can think of is that my examples show some long latency times (5 to 11 seconds), but I'm unsure if that's related. It's just the only other data point I have right now.

i8beef avatar Sep 01 '20 06:09 i8beef

Google did not call my service on a session created at the end of April (I did not use the Home app for months). I just used the Home app and Google called the service for a token refresh (that is 4.5 months later). My service calls RequestSync on a regular basis. OAuth2 services that do not do this (Auth0?) may experience problems. Hope this helps; for me it did.

symdeb avatar Sep 04 '20 13:09 symdeb

What it looks like to me is that it's possible for Google to make multiple simultaneous requests, and if the access token is expired at that point, that translates to multiple simultaneous refreshToken requests. Most OAuth providers are going to invalidate a refresh token on first use, so you have an inherent race condition here.

Looking at the request logs I'm seeing, I worry Google might be doing something like

  1. First command runs, and sees it needs a refresh. Starts a token refresh and blocks other commands, but only for 5 seconds. It fails to respond in 5 seconds (guessing timeframe based on logs), and the OAuth provider invalidates the refresh token because it thinks it sent back a new one.
  2. Second command kicks off while first command refresh is still running, sees it STILL needs a refresh, and tries to refresh again.
  3. In the mean time, the first command DOES finish up, but because it didn't happen in 5 seconds, the new tokens are discarded. You are now unlinked.
  4. Second command, and all others, fails because the old refresh token is bad now.

You either need

  1. RefreshToken reuse (doesn't solve the issue, but moves the problem to only occurring at refreshToken expiration time)
  2. Refresh token grace periods where the OAuth provider will still accept the same refresh token for within some grace period. This should let the Google side at least complete the refresh token hand shake. Depending on how this is done (e.g. a single access token per client id rule), this could introduce a DIFFERENT race condition where a COMMAND might fail, but you at least wouldn't lose your entire app link...
  3. Google changes how they do token management on their side to disallow simultaneous refresh token requests (i.e., if one command initiates a token refresh, all other commands block until it's done).
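Option 2 above can be sketched on the OAuth server side as follows. This is an illustration under stated assumptions, not how any particular provider implements it: `RotatingRefreshStore` is a hypothetical class, the clock is passed in as `nowMs` to keep the example deterministic, and the token values are placeholders where a real server would use unguessable random strings.

```typescript
interface TokenPair {
  accessToken: string;
  refreshToken: string;
}

interface RotatedEntry {
  replacement: TokenPair; // what the first refresh returned
  rotatedAtMs: number;    // when the old refresh token was exchanged
}

class RotatingRefreshStore {
  private readonly graceMs: number;
  private readonly active = new Set<string>();
  private readonly rotated = new Map<string, RotatedEntry>();
  private counter = 0;

  constructor(graceMs: number) {
    this.graceMs = graceMs;
  }

  // Register the refresh token handed out at account linking time.
  seed(refreshToken: string): void {
    this.active.add(refreshToken);
  }

  // Exchange a refresh token. Returns undefined where a real server would
  // answer with an invalid_grant error.
  refresh(refreshToken: string, nowMs: number): TokenPair | undefined {
    if (this.active.has(refreshToken)) {
      this.active.delete(refreshToken);
      const replacement = this.mint();
      this.active.add(replacement.refreshToken);
      this.rotated.set(refreshToken, { replacement: replacement, rotatedAtMs: nowMs });
      return replacement;
    }
    const previous = this.rotated.get(refreshToken);
    if (previous !== undefined && nowMs - previous.rotatedAtMs <= this.graceMs) {
      // A duplicate request inside the grace window gets the same answer,
      // so both callers end up holding the same valid tokens.
      return previous.replacement;
    }
    return undefined;
  }

  private mint(): TokenPair {
    this.counter += 1;
    // Placeholder values only (assumption for the sketch).
    return { accessToken: `access-${this.counter}`, refreshToken: `refresh-${this.counter}` };
  }
}
```

With a grace window longer than the gap between the duplicate requests, both refresh calls receive the same new token pair, so it no longer matters which response the caller keeps.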

Without seeing how Google is managing tokens, this is all just forensic guessing... but maybe I've said something here that will convince a Google engineer to revisit here. Maybe we can work around it with some less than ideal OAuth settings (refresh token reuse or grace periods if supported by your provider), but my gut is telling me there's something in the Google token management that is weird here.

I'll attach some of the logs here from a user of my implementation that show the actual request order. I think he has some other blocking issues going on in addition, but the refresh token handshake reveals that the token endpoint got called twice attempting a token refresh about 5 seconds apart, and the third request completely failed the auth check because the token was gone. Whatever is happening here, that third call DEFINITELY shouldn't have been using the same refresh token, as BOTH of the previous refresh token requests finished a second or two before that, so Google's side SHOULD have had a different token at that point, but somehow doesn't. He's getting this consistently every other day, so he has a repeatable setup for the issue.

I know my implementation is NOT this app, but since you have people here with similar issues using proper OAuth servers, I figured I'd give you what I can...

[01:00:11 INF] Request starting HTTP/1.1 POST http://localhost:5000/connect/token application/x-www-form-urlencoded 180
[01:00:17 INF] Request starting HTTP/1.1 POST http://localhost:5000/connect/token application/x-www-form-urlencoded 180
[01:00:19 INF] Removing expired grants
[01:00:19 INF] Invoking IdentityServer endpoint: IdentityServer4.Endpoints.TokenEndpoint for /connect/token
[01:00:19 INF] Invoking IdentityServer endpoint: IdentityServer4.Endpoints.TokenEndpoint for /connect/token
[01:00:23 INF] Token request validation success, {"ClientId": "***REDACTED***", "ClientName": "Google Actions Client", "GrantType": "refresh_token", "Scopes": null, "AuthorizationCode": null, "RefreshToken": "xRDqeShwiGOVe4dvUUJRnGsL6KaIU7UNbWeRhTRcR-U", "UserName": null, "AuthenticationContextReferenceClasses": null, "Tenant": null, "IdP": null, "Raw": {"grant_type": "refresh_token", "refresh_token": "***REDACTED***", "client_id": "***REDACTED***", "client_secret": "***REDACTED***"}, "$type": "TokenRequestValidationLog"}
[01:00:23 INF] Token request validation success, {"ClientId": "***REDACTED***", "ClientName": "Google Actions Client", "GrantType": "refresh_token", "Scopes": null, "AuthorizationCode": null, "RefreshToken": "xRDqeShwiGOVe4dvUUJRnGsL6KaIU7UNbWeRhTRcR-U", "UserName": null, "AuthenticationContextReferenceClasses": null, "Tenant": null, "IdP": null, "Raw": {"grant_type": "refresh_token", "refresh_token": "***REDACTED***", "client_id": "***REDACTED***", "client_secret": "***REDACTED***"}, "$type": "TokenRequestValidationLog"}
[01:00:25 WRN] Failed to remove token with key twkInXO9793B...***REDACTED***
[01:00:25 INF] Wrote tokens to config/tokens.json
[01:00:26 INF] Wrote tokens to config/tokens.json
[01:00:26 INF] Wrote tokens to config/tokens.json
[01:00:26 INF] Request finished in 16078.8725ms 200 application/json; charset=UTF-8
[01:00:26 INF] Request finished in 9171.0837ms 200 application/json; charset=UTF-8
[01:00:28 INF] Request starting HTTP/1.1 POST http://localhost:5000/connect/token application/x-www-form-urlencoded 180
[01:00:28 INF] Invoking IdentityServer endpoint: IdentityServer4.Endpoints.TokenEndpoint for /connect/token
[01:00:28 WRN] Failed to find token with key twkInXO9793B***REDACTED***
[01:00:28 WRN] Invalid refresh token
[01:00:28 WRN] Refresh token validation failed. aborting, {"ClientId": "***REDACTED***", "ClientName": "Google Actions Client", "GrantType": "refresh_token", "Scopes": null, "AuthorizationCode": null, "RefreshToken": null, "UserName": null, "AuthenticationContextReferenceClasses": null, "Tenant": null, "IdP": null, "Raw": {"grant_type": "refresh_token", "refresh_token": "***REDACTED***", "client_id": "***REDACTED***", "client_secret": "***REDACTED***"}, "$type": "TokenRequestValidationLog"}
[01:00:29 INF] Request finished in 125.6372ms 400 application/json; charset=UTF-8

i8beef avatar Sep 06 '20 20:09 i8beef

For those using Auth0, also, there's the fact that any refresh token reuse apparently invalidates THE ENTIRE TOKEN CHAIN... so if the above happened to you, it would not only reject the second command, but INVALIDATE any refresh token Google might be holding at that point, so the next time it has to use it, you'd be unlinked too. https://auth0.com/docs/tokens/refresh-tokens/refresh-token-rotation

Not all providers will do this, and that seems a bit heavy-handed, but if the Google side isn't blocking correctly on this, it means it's going to invalidate itself fairly regularly.
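To make the consequence concrete, here is a toy model of rotation with reuse detection. It is not Auth0's implementation, only a sketch of the behavior described in the linked page; `TokenFamily` is a hypothetical name and the token values are placeholders.

```typescript
// Toy model: one chain ("family") of rotating refresh tokens for one link.
class TokenFamily {
  private current: string;
  private readonly used = new Set<string>();
  private revoked = false;
  private counter = 0;

  constructor(initialRefreshToken: string) {
    this.current = initialRefreshToken;
  }

  // Returns the next refresh token, or undefined when the request is rejected.
  refresh(refreshToken: string): string | undefined {
    if (this.revoked) {
      return undefined;
    }
    if (this.used.has(refreshToken)) {
      // Reuse detected: revoke the whole family, including the newest token.
      this.revoked = true;
      return undefined;
    }
    if (refreshToken !== this.current) {
      return undefined;
    }
    this.used.add(refreshToken);
    this.counter += 1;
    this.current = `refresh-${this.counter}`; // placeholder value
    return this.current;
  }
}
```

In this model a second refresh with the already used token fails, and so does every later refresh with the newest token, which matches the "unlinked" symptom described above.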

i8beef avatar Sep 06 '20 20:09 i8beef

This sounds like a fantastic discussion about possible improvements to the account linking infrastructure. To get it more visibility, I would suggest continuing it in the smart home public issue tracker as described in https://developers.google.com/assistant/smarthome/support, as this GitHub issue tracker is for tracking issues about the sample itself (not the platform).

What do you all think?

proppy avatar Jan 21 '21 14:01 proppy

  • add offline_access scope to the account linking section of your action;

It should be obvious really but I didn't add offline_access originally so my implementation only worked shortly after linking and I saw OPEN_AUTH_FAILURE errors after that. Thanks to @sskorol for pointing out this scope.

potomato avatar Feb 27 '21 09:02 potomato