datahub icon indicating copy to clipboard operation
datahub copied to clipboard

AzureAD Reflection Token Refresh

Open c-thiel opened this issue 3 years ago • 6 comments

Hi @cccs-eric, I found another one while reflecting our AD: The Token that we fetch at the beginning has only limited validity and there is currently no refresh-mechanism built in.

I am crawling a very big AD which has been running for 1h now (I think, I will check the exact time tomorrow) when these messages start showing up:

Response status code: 401. Response content: b'{"error":{"code":"InvalidAuthenticationToken","message":"Access token has expired or is not yet valid.","innerError":{"date":"2022-04-19T15:43:51","request-id":"00000000-0000-0000-0000-0000000000000","client-request-id":"00000000-0000-0000-0000-0000000000000"}}}'

We should check the validity date of the token and add a refresh mechanism.

c-thiel avatar Apr 19 '22 15:04 c-thiel

Hi @c-thiel - This sounds to me like a feature request. Is there any way you're aware of to increase the expiration of the Azure Token?

jjoyce0510 avatar Apr 19 '22 19:04 jjoyce0510

Hi @jjoyce0510, It seems that this feature is currently in Preview in AD: https://docs.microsoft.com/en-us/azure/active-directory/develop/active-directory-configurable-token-lifetimes

At least for our AD I don't have the rights to add a policy - and I think this is the same for most other users as this needs to be done by AD Admins. Of course such a policy would also impede security a bit and thus always require a discussion.

By default the tokens are valid between 60-90 minutes. The exact validity is a random number in that interval. (Source: https://docs.microsoft.com/en-us/azure/active-directory/develop/active-directory-configurable-token-lifetimes#access-tokens)

I guess this is a feature request yes. I don't think I can change the label.

c-thiel avatar Apr 20 '22 07:04 c-thiel

@jjoyce0510 If I can add my 2 cents here, when I first looked into this code to add support for nested groups, I immediately thought that this source could be refactored by leveraging Microsoft Python SDK for Azure AD. This SDK handles quite a few things when it comes to authentication like refresh tokens. We could also use one of the MS SDKs for Graph and completely hide the REST API calls.

All that to say that if someone tackles this issue, moving to an official SDK should be considered in my opinion instead of doing everything by hand.

cccs-eric avatar Apr 20 '22 12:04 cccs-eric

Hi @jjoyce0510, we are going to create a PR. For now, just fixing the token issue using the azure library. For the future we should also include some more parallelization.

c-thiel avatar Apr 29 '22 09:04 c-thiel

@c-thiel - Thanks for the followup. Is this still an issue?

jjoyce0510 avatar Jul 12 '22 20:07 jjoyce0510

This issue is stale because it has been open for 30 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io

github-actions[bot] avatar Sep 15 '22 06:09 github-actions[bot]

This issue was closed because it has been inactive for 30 days since being marked as stale.

github-actions[bot] avatar Oct 16 '22 02:10 github-actions[bot]