A Tenant can seem to be getting a DIFFERENT tenant's TAA acceptance

Open loneil opened this issue 1 year ago • 4 comments

I think I’ve found an issue with TAA acceptance in the Multi-Tenant case where a tenant that has NOT accepted the TAA for a ledger can seem to get a response from their TAA endpoint that says they HAVE accepted the TAA.

I think this has something to do with in-memory state in ACA-Py rather than database persistence or any actual ledger write.

I don’t think this is a Tenant UI display issue.

I can reproduce this with 2 tenants (both using sovrin-testnet) as follows:

Step 1: Log in with a Tenant that has not accepted TAA.

Tenant 1 (created Dec 1) has not accepted the TAA for sovrin test.

Step 2: Log in with a Tenant that has accepted the TAA

Tenant 2 HAS accepted the TAA; calling the /taa endpoint returns its acceptance record.

Step 3: Go back to Tenant 1 and refresh (call the /taa endpoint again)

Now Tenant 1 thinks it’s accepted the TAA on the same date as Tenant 2!

It shows an Oct 23 acceptance date even though the Tenant did not exist until Dec 1.

Step 4: Restart ACA-Py (only tested this on OCP, by killing pods; haven’t tried locally)

Now Tenant 1 is back to knowing it has not accepted the TAA


Confirmed it's not specific to Tenant UI, and is "pod related" in the multi-pod scenario in dev.

Using one Tenant's token in a single Swagger instance, I get 2 different TAA results at random as requests bounce between pods.
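One way to make the "bounces between pods" observation measurable is to repeat the same call and tally the distinct responses. The sketch below is only an illustration: `tally_responses` and its `fetch` argument are hypothetical names, and `fetch` stands in for whatever call returns the tenant's TAA result as a dict.

```python
import json
from collections import Counter


def tally_responses(fetch, attempts=20):
    """Call fetch() repeatedly and count each distinct response.

    fetch is a hypothetical zero-argument callable returning the TAA
    result for one tenant as a JSON-serialisable dict.
    """
    counts = Counter()
    for _ in range(attempts):
        # Serialise with sorted keys so equal dicts map to the same bucket.
        counts[json.dumps(fetch(), sort_keys=True)] += 1
    return counts
```

More than one key in the returned counter for a single token would suggest that different pods hold different cached state.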


loneil avatar Dec 11 '23 23:12 loneil

I checked the code in ACA-Py. It appears that the acceptance method is being cached at the indy-vdr global level rather than at the tenant profile level (see here when the result is set, and here when it is fetched). This would cause the acceptance method to appear "shared" across all tenants for the duration of the cache (10 min by default), with the most recently set or fetched acceptance being returned until the cache misses.
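To illustrate the suspected mechanism, here is a minimal sketch of a TTL cache keyed only by ledger pool versus one keyed by pool and tenant. This is not ACA-Py's code: `TTLCache`, `shared_key`, `tenant_key`, the key strings, and the tenant names are all hypothetical, and the timestamp is just an example value.

```python
import time


class TTLCache:
    """Tiny in-memory cache with per-entry expiry (illustrative only)."""

    def __init__(self, now=time.monotonic):
        self._now = now
        self._data = {}

    def set(self, key, value, ttl):
        self._data[key] = (value, self._now() + ttl)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._now() >= expires_at:
            # Entry expired: drop it and report a miss.
            del self._data[key]
            return None
        return value


CACHE_TTL = 600  # seconds; the 10 minute default mentioned above


def shared_key(ledger_pool):
    # Suspected behaviour: the key depends only on the ledger pool.
    return f"taa_acceptance::{ledger_pool}"


def tenant_key(ledger_pool, tenant_id):
    # One possible fix: scope the key to the tenant (profile) as well.
    return f"taa_acceptance::{ledger_pool}::{tenant_id}"


# Shared key: tenant B (accepted) populates the cache ...
cache = TTLCache()
cache.set(shared_key("sovrin-testnet"), {"time": 1709769600}, CACHE_TTL)
# ... and tenant A (never accepted) reads B's acceptance back.
leaked = cache.get(shared_key("sovrin-testnet"))

# Tenant-scoped key: tenant A gets a miss and must consult its own record.
scoped = TTLCache()
scoped.set(tenant_key("sovrin-testnet", "tenant-b"), {"time": 1709769600}, CACHE_TTL)
not_leaked = scoped.get(tenant_key("sovrin-testnet", "tenant-a"))
```

With the shared key, `leaked` holds tenant B's acceptance even though tenant A asked; with the tenant-scoped key, `not_leaked` is `None`. Storing the acceptance in the tenant's profile, as suggested, would have the same isolating effect.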

@andrewwhitehead does the analysis above make sense? Any recommendations to make other than saving the TAA acceptance method in the tenant's profile rather than in a shared cache?

esune avatar Dec 12 '23 00:12 esune

Will try this out with a nightly build including the change from https://github.com/hyperledger/aries-cloudagent-python/pull/2676 sometime soon to confirm Traction behaviour.

loneil avatar Dec 20 '23 20:12 loneil

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] avatar Jan 20 '24 02:01 github-actions[bot]

I can still reproduce this in the 0.12.0rc2 setup running here https://pr-1025-traction-tenant-ui-dev.apps.silver.devops.gov.bc.ca/

It can be seen with the following steps:

Log in with "Lucas Sovrin 2". This tenant has not accepted the TAA. 18b10d1e-cc33-4ebd-a47f-b58269d1af4a a1c54a03-04bd-4a84-ba68-0531fbd2706e

Go to the profile page to see the TAA status, or get the token and call the TAA endpoint. It returns no TAA acceptance.

Refresh and try the API call multiple times; the TAA still shows as unaccepted. (The PR runs on 1 pod, so it only hits a single cache anyway.)

Log in with "Lucas Sovrin 1". This tenant has accepted the TAA. 30d67ebf-a775-4132-8995-e1b049addca8 57f8bf4a-8082-4e5c-8f91-fd9652cdb9d2

The TAA shows as accepted with time 1709769600.

Now go back and refresh or call the API again for Tenant 2 (no TAA) and it will claim it has accepted the TAA, with the same time 1709769600.

Wait for the cache timeout or restart ACA-Py, then look at Tenant 2 again: its status will be correct again, until the API call is made for Tenant 1 again.
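The steps above could be scripted roughly as follows. This is a sketch under assumptions: the `/ledger/taa` path, the bearer-token header, and the `result` / `taa_accepted` / `time` field names are my guesses at the response shape and may differ in a given deployment; `fetch_taa` and `acceptance_leaked` are hypothetical helper names.

```python
import json
import urllib.request


def fetch_taa(base_url, token, path="/ledger/taa"):
    """Fetch the TAA result for the tenant that owns `token`.

    The path and response layout are assumptions; adjust for your setup.
    """
    req = urllib.request.Request(
        base_url.rstrip("/") + path,
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp).get("result", {})


def acceptance_leaked(unaccepted_result, accepted_result):
    """True if the tenant that never accepted reports the other tenant's
    acceptance time (the symptom described in this issue)."""
    mine = unaccepted_result.get("taa_accepted")
    theirs = accepted_result.get("taa_accepted")
    if not mine or not theirs:
        return False
    return mine.get("time") == theirs.get("time")
```

Calling `fetch_taa` for the tenant that accepted, then for the tenant that did not, and passing both results to `acceptance_leaked` should return `True` while the bug is present and `False` after the cache expires or ACA-Py restarts.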

loneil avatar Mar 08 '24 00:03 loneil