azure-functions-host
Caching issue - function app staging slot keys appear in portal to change after swap slot performed
Caching issue - When a function app with both a prod and a staging slot performs a swap, the portal shows the staging slot's function keys taking on the same values as the prod keys, but the keys in storage do not change. (Keys not changing on swap is the expected behavior with default settings.)
Investigative information
This IcM was filed
Please provide the following:
- Timestamp: 9-28-21 23:07:00 UTC
- Function App version: 3
- Function App name: SlotsSwapKey
- Function name(s) (as appropriate):
- Invocation ID:
- Region: Central US
Repro steps
Provide the steps required to reproduce the problem:
- Create a function app with 2 deployment slots.
- Go to function app's storage account (name can be found in Configuration --> AzureWebJobsStorage)
- In the storage account: containers --> azure-webjobs-secrets, then for each slot, host.json --> Edit --> set masterKey "encrypted" to false.
- Note keys for each slot. (App keys)
- Perform a slot swap in the function app, check keys for both prod and staging slot.
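The key comparison in the steps above can be sketched in Python. This is only a rough illustration: it assumes each slot's secrets file has already been downloaded from the azure-webjobs-secrets container as text, that it has a `masterKey` object and a `functionKeys` list with `name`/`value` fields, and that `encrypted` is false so the values are readable. The helper names and the sample key values are made up for the example.

```python
import json


def extract_host_keys(secrets_text):
    """Return {key_name: value} for the master key and the host-level keys.

    Assumes the layout described above (masterKey object, functionKeys list)
    with plaintext values, i.e. "encrypted" set to false.
    """
    doc = json.loads(secrets_text)
    keys = {}
    master = doc.get("masterKey") or {}
    if "value" in master:
        keys[master.get("name", "master")] = master["value"]
    for entry in doc.get("functionKeys", []):
        keys[entry["name"]] = entry["value"]
    return keys


def changed_keys(before, after):
    """Sorted names whose value differs between two snapshots, or that exist in only one."""
    names = set(before) | set(after)
    return sorted(n for n in names if before.get(n) != after.get(n))


# Hypothetical snapshots of the staging slot's secrets file, before and after a swap.
staging_before = json.dumps({
    "masterKey": {"name": "master", "value": "staging-master", "encrypted": False},
    "functionKeys": [{"name": "default", "value": "staging-default", "encrypted": False}],
})
staging_after = staging_before  # storage is reported not to change on swap

print(changed_keys(extract_host_keys(staging_before), extract_host_keys(staging_after)))
```

An empty list for the storage snapshots, combined with different values shown in the portal, would match the behavior reported here.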
Expected behavior
Provide a description of the expected behavior.
After step 5, the keys should have stayed the same.
Actual behavior
Instead, the portal shows that the staging slot adopted the prod slot's keys.
Known workarounds
Setting WEBSITE_FUNCTIONS_ARMCACHE_ENABLED=0 makes the issue go away.
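For anyone applying the setting to many apps or slots, here is a small Python sketch that builds the corresponding `az functionapp config appsettings set` command. It only constructs and prints the command; it does not run it. The resource group name `my-rg` and the slot name `staging` are placeholders, and the helper name is made up for the example.

```python
import shlex


def armcache_disable_command(app_name, resource_group, slot=None):
    """Build the az CLI argument list that sets WEBSITE_FUNCTIONS_ARMCACHE_ENABLED=0.

    Without a slot, the command targets the production slot of the app.
    """
    cmd = [
        "az", "functionapp", "config", "appsettings", "set",
        "--name", app_name,
        "--resource-group", resource_group,
        "--settings", "WEBSITE_FUNCTIONS_ARMCACHE_ENABLED=0",
    ]
    if slot:
        cmd += ["--slot", slot]
    return cmd


# Placeholder resource group and slot name; replace with real values before running.
print(shlex.join(armcache_disable_command("SlotsSwapKey", "my-rg", slot="staging")))
```

The returned list can be passed to `subprocess.run` as-is if the Azure CLI is installed and logged in.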
Related information
screenshot of portal after a slot swap:
Any known fix for this? We use Terraform to create the initial deployment slot and az cli to perform the swap. For some very odd reason, our DEV and QA environments have no problem: there, the production and staging slots share the same host keys. In our prod environment, however, the API Management backend integration to the Function App breaks because the keys differ between the prod and staging slots. Super weird. WEBSITE_FUNCTIONS_ARMCACHE_ENABLED=0 did not help, unfortunately.
The original issue is most likely an ARM cache invalidation issue, given the brief investigation/repro from @soninaren and the fact that the original issue was mitigated by the ARM cache setting. If the setting doesn't work, a workaround could be to use the host APIs for key management (here).
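A minimal Python sketch of what reading keys through the host API might look like, so the values come from the host rather than the portal view. It rests on several assumptions: that the host exposes `GET /admin/host/keys`, that it accepts the master key in an `x-functions-key` header, and that a slot is reachable at the default `<app>-<slot>.azurewebsites.net` hostname. The helper name is made up, and the block only builds the request without sending it.

```python
import urllib.request


def host_keys_request(app_name, master_key, slot=None):
    """Build (but do not send) a GET request for the host-level keys.

    Assumes the default hostname pattern and the /admin/host/keys route;
    apps with custom domains or access restrictions will need adjusting.
    """
    host = f"{app_name}-{slot}" if slot else app_name
    url = f"https://{host}.azurewebsites.net/admin/host/keys"
    return urllib.request.Request(
        url,
        headers={"x-functions-key": master_key},
        method="GET",
    )


# Placeholder master key; a real one would come from the slot's own secrets.
req = host_keys_request("SlotsSwapKey", "<master-key>", slot="staging")
print(req.get_method(), req.full_url)
```

Sending it with `urllib.request.urlopen(req)` for each slot and comparing the responses would show whether the hosts themselves disagree, or only the portal does.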
@RobertPaulson90 you may be experiencing a different issue than the original if the setting isn't mitigating the problem. If you're still facing this issue, can you share a bit more about your scenario? Are you saying that swapping with DEV and QA is fine but swapping Prod and Staging slots has issues?
@satvu We are experiencing this issue as well. What are the implications of setting WEBSITE_FUNCTIONS_ARMCACHE_ENABLED=0? We have 200+ function apps running in production. It feels kind of wrong to apply a preventive workaround to all of them, but it's pretty troublesome if we have to go and apply this to every single function app when the issue occurs.
Hi, are there any updates on this issue?
@Strandfelt We suspect that the ARM cache gets out of sync following a swap. That app setting mitigates the issue by circumventing the cache. Key operations may take slightly longer.
@Trimatix we do not have any updates currently, but they will be posted here as soon as they are available.