
TimerTrigger fires multiple times when different Assembly Versions of the same application are running

Open imMatt opened this issue 2 years ago • 3 comments

A TimerTrigger fires twice because the singleton lock ID changes whenever the assembly version changes.

Repro steps


  1. Create a project with an Azure function that is TimerTriggered, and specify a version for the assembly, e.g. 1.0.0.0
  2. Run this project
  3. Update the assembly version, e.g. to 1.0.0.1
  4. Run the updated project without stopping the old version
  5. The TimerTrigger fires twice, because both versions successfully acquire a storage lock (note that the lock ID has changed between deployments)

Expected behavior

TimerTrigger will only fire on a single version.

Actual behavior

TimerTrigger will fire on both versions because the singleton lock ID differs between versions.

Known workarounds

Setting Assembly Version to a constant value.

Related information

Running on latest (3.0.33) version of azure-webjobs-sdk

From my analysis, this appears to be caused by the implementation of DefaultHostIdProvider.cs. The host ID is calculated from a hash of DeclaringType.Assembly.FullName, e.g. "Examples.Jobs.Project, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null".

This host ID is used by the TimerTrigger's underlying singleton lock.

It appears that an assumption has been made that this value will be constant. However, the assembly version is part of the hashed string, so the assumption no longer holds once the version number changes.

In our case we're using Nerdbank.GitVersioning, which means the host ID changes with every build and hence a new singleton lock is generated. This becomes an issue for us because we run in K8s across regions, where we bring up pods of the new version before the old versions complete termination.

Hence, if a timer trigger fires during a deployment, both the old and new versions can execute despite 'TimerTrigger' being a singleton, because the old and new versions use different storage locks.
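To make the mechanism concrete, here is a minimal, language-neutral sketch in Python. It is not the SDK's code: MD5 is only a stand-in for whatever hash DefaultHostIdProvider actually uses, the function name is made up, and the lock ID layout (host ID, then function name plus ".Listener") is inferred from the log lines in this issue.

```python
import hashlib


def host_id_from(identity: str) -> str:
    # Stand-in hash (assumption): the SDK's real algorithm is not reproduced here.
    return hashlib.md5(identity.encode("utf-8")).hexdigest()


def lock_id(identity: str, function_name: str) -> str:
    # Layout inferred from the sample logs: "<hostId>/<function>.Listener".
    return f"{host_id_from(identity)}/{function_name}.Listener"


def simple_name(full_name: str) -> str:
    # Drop Version/Culture/PublicKeyToken and keep only the assembly name.
    return full_name.split(",")[0].strip()


# Hypothetical function name, used only for illustration.
FUNCTION = "Examples.Jobs.Project.Functions.MyTimer.Run"
OLD = "Examples.Jobs.Project, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null"
NEW = "Examples.Jobs.Project, Version=1.0.0.1, Culture=neutral, PublicKeyToken=null"

# Hashing the full name: each version gets its own lock, so both can acquire one.
versioned_old = lock_id(OLD, FUNCTION)
versioned_new = lock_id(NEW, FUNCTION)

# Hashing a version-independent identity: both versions contend for one lock.
stable_old = lock_id(simple_name(OLD), FUNCTION)
stable_new = lock_id(simple_name(NEW), FUNCTION)

print("versioned locks differ:", versioned_old != versioned_new)
print("stable locks match:", stable_old == stable_new)
```

The second pair shows why pinning the assembly version works as a workaround: once the hashed input stops changing between builds, old and new instances compete for the same lock.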

Sample Logs:

Note: the same function has been invoked twice for the same trigger. A different lock ID was acquired by each instance due to the version change.

Version 0.2.556.23330

02/08/2022, 10:42:29.698 pm (Local time) Singleton lock acquired (79483a8b9f897f831f00e42bac341a5d/AssemblyNameHere.Functions.FunctionNameHere.Run.Listener)

Invoked at 08/02/2022 12:43:00, timer details: "Cron: '0 * * * * *'", "{ \"Last\" : ISODate(\"0001-01-01T00:00:00Z\"), \"Next\" : ISODate(\"2022-08-02T12:43:00Z\"), \"LastUpdated\" : ISODate(\"0001-01-01T00:00:00Z\") }"

Hash ID based off: AssemblyNameHere, Version=0.2.556.23330, Culture=neutral, PublicKeyToken=null


Version 0.2.557.14383

02/08/2022, 10:42:06.848 pm (Local time) Singleton lock acquired (37100d1ba5ba9dc196318a31b433b6a3/AssemblyNameHere.Functions.FunctionNameHere.Run.Listener)

Invoked at 08/02/2022 12:43:00, timer details: "Cron: '0 * * * * *'", "{ \"Last\" : ISODate(\"0001-01-01T00:00:00Z\"), \"Next\" : ISODate(\"2022-08-02T12:43:00Z\"), \"LastUpdated\" : ISODate(\"0001-01-01T00:00:00Z\") }"

Hash ID based off: AssemblyNameHere, Version=0.2.557.14383, Culture=neutral, PublicKeyToken=null

Note: the real project name has been redacted, which is why the hashes won't match if you hash that value.

imMatt, Aug 05 '22 02:08

I'm happy to work on a solution for this defect, but would like some guidance from your side on the optimal approach. An environment variable to override this value, similar to what the azure-function-host does, should resolve this without a breaking change, imo.
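A rough sketch of that override idea, written in Python for brevity rather than as SDK code. The variable name WEBJOBS_HOST_ID is hypothetical, and MD5 again stands in for the real hash. An explicit value wins; otherwise the current assembly-derived behaviour is kept, which is what would make the change non-breaking.

```python
import hashlib
import os
from typing import Mapping, Optional

# Hypothetical variable name, chosen for illustration only.
HOST_ID_ENV_VAR = "WEBJOBS_HOST_ID"


def resolve_host_id(assembly_full_name: str,
                    env: Optional[Mapping[str, str]] = None) -> str:
    """Return the explicit override if set, else the assembly-derived ID."""
    env = os.environ if env is None else env
    override = env.get(HOST_ID_ENV_VAR, "").strip()
    if override:
        # Operator-supplied ID: stable across builds and deployments.
        return override
    # Fallback mirrors today's behaviour (stand-in hash, see assumption above).
    return hashlib.md5(assembly_full_name.encode("utf-8")).hexdigest()


OLD = "Examples.Jobs.Project, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null"
NEW = "Examples.Jobs.Project, Version=1.0.0.1, Culture=neutral, PublicKeyToken=null"

pinned = {HOST_ID_ENV_VAR: "examples-jobs-project"}
print(resolve_host_id(OLD, pinned) == resolve_host_id(NEW, pinned))
print(resolve_host_id(OLD, {}) == resolve_host_id(NEW, {}))
```

With the override set, both versions resolve to the same host ID and therefore the same lock; without it, nothing changes for existing deployments.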

imMatt, Aug 05 '22 02:08

Hi @imMatt, thank you for the feedback. We will investigate this issue further and let you know about the findings soon!

Ved2806, Aug 11 '22 12:08

Is there a fix for this? We are experiencing the same problem. Whenever we deploy a new version of our webjob, a different host ID is generated. Because we deploy to the staging slot, and the production slot has a different host ID, the TimerTrigger fires in both slots, causing undesirable behavior.

pilarodriguez, May 29 '23 12:05