extensions icon indicating copy to clipboard operation
extensions copied to clipboard

Counter extension run locally: Transaction to aggregate shards failed. TypeError: Cannot read properties of undefined (reading 'length')

Open polarby opened this issue 3 years ago • 11 comments

Configuration

  • Extension name: Distributed counter
  • Extension version: 0.2.4 (19 Oct 2021)
  • Configuration values (redact info where appropriate): LOCATION="europe-west3" INTERNAL_STATE_PATH= "firebase_ext/sharded_counter" SCHEDULE_FREQUENCY= 1

Problem

This is a follow-up bug report to this question

During the Implementation of Distributed-Counter-Extension for local emulator I was able to run the scheduled function (controllerCore();) via the firebase shell and it will only work for the first aggregation. The next one will throw the following error:

Transaction to aggregate shards failed. 
TypeError: Cannot read properties of undefined (reading 'length')\n    at /Users/**/Documents/**/**-Backend/functions/internal/counter/controller.js:118:44\n . 

What I have encountered is, that when removing _firebase_ext_ from the local firestore it will work again for one aggregation again (and the directory will be created again). When running the controllerCore again after the error I will get a timeout warning of my worker function.

functions: Your function timed out after ~60s. To configure this timeout, see
      https://firebase.google.com/docs/functions/manage-functions#set_timeout_and_memory_allocation.
>  /usr/local/lib/node_modules/firebase-tools/lib/emulator/functionsEmulatorRuntime.js:618
>              throw new Error("Function timed out.");

Steps to reproduce:

Implement the Counter extension to the local emulator with the following export changes in the index file as described here:

exports.controllerCore = functions.pubsub.schedule(`every ${Config.counter.SCHEDULE_FREQUENCY} minutes`).onRun(async () => {}
exports.worker = functions.firestore.document(`${Config.counter.INTERNAL_STATE_PATH}/workers/{workerId}`).onWrite(async (change, context) => {}
exports.onWrite = functions.firestore.document(`${Config.counter.INTERNAL_STATE_PATH}/workers/{workerId}`).onWrite(async (change, context) => {}

polarby avatar Feb 05 '22 09:02 polarby

Instant temporary solution to bug report

It seems that the worker array doesn't get initialized after the first iteration.

To solve the error message and make the extension work properly replace the following line in the contoller.js file:

t.set(this.controllerDocRef, { timestamp: firebase_admin_1.firestore.FieldValue.serverTimestamp() }, { merge: true });

to

if (controllerDoc.exists) { //only update timestamp
   t.set(this.controllerDocRef, { timestamp: firebase_admin_1.firestore.FieldValue.serverTimestamp() }, { merge: true });
} else { //init worker list as well
  t.set(this.controllerDocRef, { workers: [], timestamp: firebase_admin_1.firestore.FieldValue.serverTimestamp() }, { merge: true });

}

I am not sure if I was missing something during the implementation, but it seems this error doesn't appear in the cloud?! Is this a known error?

polarby avatar Feb 05 '22 09:02 polarby

@joehan what do you think about this?

i14h avatar Feb 08 '22 17:02 i14h

Wooof, this is a tricky one. Glad to hear that you found a workaround, and AFAIK we've never seen this behavior in production. I spent some time digging through the code, and here's my theory on why you see it only in the emulator:

  • The 'workers' collection gets created within a transaction here: https://github.com/firebase/extensions/blob/9f8d7fd6048bcaa7b5bc505cebe2e90494359e33/firestore-counter/functions/src/controller.ts#L298 That transaction includes a couple reads/writes to the same collection.
  • The Firestore emulator doesn't have great support for transactions today (https://firebase.google.com/docs/emulator-suite/connect_firestore#how_the_emulator_differs_from_production), and writing many things to one document can cause lockups - this may be why you were seeing timeouts on the worker function. You may be able to work around this by increasing the timeout of the worker function. This limitation of the Firestore Emulator is actively being worked on, and hopefully you will be fixed in the coming months.

joehan avatar Feb 08 '22 20:02 joehan

Hi @polarby.

Do you still have the issue described above, it looks like the update @joehan mentioned has now been merged and perhaps deployed?

dackers86 avatar Jul 14 '22 15:07 dackers86

Give me a week for the longest, I will make it sure to notify you about the results.

polarby avatar Jul 14 '22 15:07 polarby

Hi @polarby, Thank you for your previous feedback. Is there any news on this, is it still an ongoing issue for you?

yamankatby avatar Aug 16 '22 09:08 yamankatby

Thanks for recap. Should work now

polarby avatar Aug 17 '22 05:08 polarby

Same issue at Counter extension emulator

Same error is present, when using the new extension emulator (great implementation btw. !!!) It will work well for the first Iteration then it throw the same error mentioned above.

Tested with Following specs:

"firestore-counter": "firebase/[email protected]"
"firebase-admin": "^10.3.0"
"firebase-functions": "^3.22.0",

I am not quite sure why this is happening. Can someone recheck that this is not only my device/config/etc..? Otherwise would it be wise to merge my changes ? I mean it would not be a breaking change nor would it affect the behaviour on the server?!

Example Program:

const admin = require("firebase-admin");
const db = admin.firestore();
const functions = require("firebase-functions");
const Counter = require("./counterConnector.js"); // javascript client class


exports.download = functions.https.onCall(async (data, context) => {
  const counter = new Counter(db.doc("content/12345"), "downloads");
  await counter.incrementBy(1);
  // other code...
});

The counter javascript client can be found here

polarby avatar Aug 26 '22 04:08 polarby

Hi @polarby, Now the issue only occurs on the local emulator, right?

yamankatby avatar Aug 29 '22 08:08 yamankatby

Both ways! Either running the new extension emulator (new-way) or simply running the function on cloud functions (old-way). My belief is that this happens due to the pubsub functions. Also, take note that when starting the emulator following warning appears on the counter extension (tested on new-way):

Unsupported function type on ext-firestore-counter-controllerCore. Expected either an httpsTrigger, eventTrigger, or blockingTrigger.
i  functions[europe-west3-ext-firestore-counter-controllerCore]: function ignored because the unknown emulator does not exist or is not running.

It appears that certain triggers might be unsupported resulting in them not getting triggered and not initiating the worker's array.

polarby avatar Aug 29 '22 17:08 polarby

Take a look at the pull request I have just created. This should resolve the local emulator problems. Based on my understanding this change should not interfere with the cloud behavior, but this is NOT TESTED.

There might be very well a better solution and my pull request might be just a quick fix for a more severe problem.

polarby avatar Aug 29 '22 18:08 polarby

EDIT: This seems to be a wrong report on my end (i cannot reproduce a resolvement of this issue) I don't know what has changed, but the bug has been resolved with "firebase-functions": "^4.0.2" and "firestore-counter": "firebase/[email protected]"

polarby avatar Nov 21 '22 08:11 polarby

@polarby @yamankatby I see this error two with newest firebase-tools, firebase-functions, and firebase-counter.

What is interesting is that the error only occurs on the second run of aggregateOnce after the emulator including the extension were started.

Here is an excerpt of the logs.

i  functions: Beginning execution of "controllerCore"
>  aggregateOnce first read does doc exist? false
>  {"severity":"INFO","message":"Aggregated 0 counters."}
i  functions: Finished "controllerCore" in 639.8003ms
i  functions: Beginning execution of "controllerCore"
>  aggregateOnce first read does doc exist? true
>  {"severity":"INFO","message":"Transaction to aggregate shards failed. TypeError: Cannot read properties of undefined (reading 'length')\n   

Why is it happening?

As on the first run the controllerDoc does not exist at all, controller.js falls back to a predefined worker object EMPTY_CONTROLLER_DATA.

https://github.com/firebase/extensions/blob/6656cfe040d0e4a7908700391a04184cf549678a/firestore-counter/functions/src/controller.ts#L144-L154

However, later on in aggregateOnce the controllerDoc document is updated with merge: true and the newest timestamp. Due to the controllerDoc not being stored in the database on first run, the document will be created with only timestamp property.

https://github.com/firebase/extensions/blob/6656cfe040d0e4a7908700391a04184cf549678a/firestore-counter/functions/src/controller.ts#L204-L207

On second run of aggregateOnce the controllerDoc exists and therefore the fallback is not used. Also - as mentioned before - on the second run the document only contains the timestamp property and the worker array is not present. Therefore, the aforementioned error.

https://github.com/firebase/extensions/blob/6656cfe040d0e4a7908700391a04184cf549678a/firestore-counter/functions/src/controller.ts#L144-L154

Why does the error not occur on the third run and afterwards?

On the second run, the workers array is not present in the controllerDoc document. Therefore, an error is thrown while checking the length of the array.

https://github.com/firebase/extensions/blob/6656cfe040d0e4a7908700391a04184cf549678a/firestore-counter/functions/src/controller.ts#L152-L156

This error gets caught by the surrounding catch clause ..

https://github.com/firebase/extensions/blob/6656cfe040d0e4a7908700391a04184cf549678a/firestore-counter/functions/src/controller.ts#L213-L215

.. which forces the calling function to call rescheduleWorkers ..

https://github.com/firebase/extensions/blob/6656cfe040d0e4a7908700391a04184cf549678a/firestore-counter/functions/src/index.ts#L40-L48

.. which in the end adds an workers array to the controllerDoc

https://github.com/firebase/extensions/blob/6656cfe040d0e4a7908700391a04184cf549678a/firestore-counter/functions/src/controller.ts#L307-L309

Does it also occur outside the emulator?

Yes, I was also able to find the error in my production function:

image

Do we want to fix it?

Indeed it is not a critical issue, because clever error handling prevents the issue from becoming critical. However, from a developer and customer experience it would be great if the error would not happen at all.

@polarby Would you be able to reopen the issue? Tomorrow I could provide a minimal PR.

stfsy avatar Dec 28 '22 22:12 stfsy

Thanks for the further elaboration on this ongoing bug and interesting insight into errors also in production (i wasn't aware of this). I am sorry about the lack of examination on the previous issue closure on my end.

polarby avatar Dec 29 '22 10:12 polarby

@polarby thank you for bringing it up in the first place. That put me on the right track already.

I created PR https://github.com/firebase/extensions/pull/1370 now. With this PR the controllerDoc will be created on the very first run, to make sure successive invocations can rely on a properly initialized state document.

stfsy avatar Dec 29 '22 20:12 stfsy

Closing as this seems to be fixed by the merged PR? Happy to reopen if i'm mistaken

cabljac avatar May 25 '23 08:05 cabljac