flutterfire icon indicating copy to clipboard operation
flutterfire copied to clipboard

[cloud_functions]: Getting DEADLINE_EXCEEDED but no trace of the call on the backend

Open orestesgaolin opened this issue 1 year ago • 12 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues.

Which plugins are affected?

Cloud Functions

Which platforms are affected?

Android

Description

We are happy users of cloud_functions SDK but sometimes it happens that the library just stops executing calls to callable functions, even though:

  • it was working just fine few minutes before
  • Firebase Auth token is valid
  • other users can access the function just fine, even from the same network

The error thrown is just DEADLINE_EXCEEDED after the specified time

I/flutter (15750): [CloudFunctionsClient] [firebase_functions/deadline-exceeded] DEADLINE_EXCEEDED
I/flutter (15750): [CloudFunctionsClient] 
I/flutter (15750): [CloudFunctionsClient] #0      StandardMethodCodec.decodeEnvelope (package:flutter/src/services/message_codecs.dart:648:7)
I/flutter (15750): [CloudFunctionsClient] #1      MethodChannel._invokeMethod (package:flutter/src/services/platform_channel.dart:334:18)
I/flutter (15750): [CloudFunctionsClient] <asynchronous suspension>
I/flutter (15750): [CloudFunctionsClient] #2      MethodChannelHttpsCallable.call (package:cloud_functions_platform_interface/src/method_channel/method_channel_https_callable.dart:22:24)
I/flutter (15750): [CloudFunctionsClient] <asynchronous suspension>
I/flutter (15750): [CloudFunctionsClient] #3      HttpsCallable.call (package:cloud_functions/src/https_callable.dart:49:37)
I/flutter (15750): [CloudFunctionsClient] <asynchronous suspension>
I/flutter (15750): [CloudFunctionsClient] #4      CloudFunctionsClient.appLaunch (package:cloud_functions_client/src/cloud_functions_client.dart:775:22)
I/flutter (15750): [CloudFunctionsClient] <asynchronous suspension>
...
I/flutter (15750): [CloudFunctionsClient] 
I/flutter (15750): [CloudFunctionsClient] #0   StandardMethodCodec.decodeEnvelope (package:flutter/src/services/message_codecs.dart:648:7)
I/flutter (15750): [CloudFunctionsClient] #1   MethodChannel._invokeMethod (package:flutter/src/services/platform_channel.dart:334:18)
I/flutter (15750): [CloudFunctionsClient] #2   <asynchronous suspension>
I/flutter (15750): [CloudFunctionsClient] #3   MethodChannelHttpsCallable.call (package:cloud_functions_platform_interface/src/method_channel/method_channel_https_callable.dart:22:24)
I/flutter (15750): [CloudFunctionsClient] #4   <asynchronous suspension>
I/flutter (15750): [CloudFunctionsClient] #5   HttpsCallable.call (package:cloud_functions/src/https_callable.dart:49:37)
I/flutter (15750): [CloudFunctionsClient] #6   <asynchronous suspension>
I/flutter (15750): [CloudFunctionsClient] #7   CloudFunctionsClient.appLaunch (package:cloud_functions_client/src/cloud_functions_client.dart:775:22)
I/flutter (15750): [CloudFunctionsClient] â›” appLaunch failed

Reproducing the issue

It's quite tricky to reproduce the issue, it may happen during debugging, but when it occurs it's just difficult to capture any debug traces outside of the above exception.

My implementation typically is quite simple

      final user = firebaseAuth.currentUser;
      if (user == null) {
        throw Exception('User is not authenticated');
      }

      // check if token is valid
      final tokenResult = await user.getIdTokenResult();
      if (tokenResult.token == null) {
        throw Exception('User token is not valid');
      }

      final result = await functions
          .httpsCallable(
        'nameOfTheFunction',
        options: HttpsCallableOptions(timeout: const Duration(seconds: 30)),
      )
          .call<Map<String, dynamic>>(
        {
          'appBuild': appBuild,
          'userAgent': userAgent,
        },
      );

Firebase Core version

2.31.1

Flutter Version

3.24.4

Relevant Log Output

No response

Flutter dependencies

Expand Flutter dependencies snippet

Dart SDK 3.5.4
Flutter SDK 3.24.4
visible_app 24.11.3

dependencies:

- cloud_functions_client 1.0.0+1 [cloud_functions equatable firebase_auth flutter json_annotation log watcher]
- cloud_storage_client 1.0.0+1 [equatable file firebase_auth firebase_storage flutter log path rxdart]
- firebase_auth 4.19.6 [firebase_auth_platform_interface firebase_auth_web firebase_core firebase_core_platform_interface flutter meta]
- firebase_core 2.31.1 [firebase_core_platform_interface firebase_core_web flutter meta]
- firebase_crashlytics 3.4.18 [firebase_core firebase_core_platform_interface firebase_crashlytics_platform_interface flutter stack_trace]
- firebase_messaging 14.9.0 [firebase_core firebase_core_platform_interface firebase_messaging_platform_interface firebase_messaging_web flutter meta]
- flutter 0.0.0 [characters collection material_color_utilities meta vector_math sky_engine]


transitive dependencies:

- cloud_functions 4.7.5 [cloud_functions_platform_interface cloud_functions_web firebase_core firebase_core_platform_interface flutter]
- cloud_functions_platform_interface 5.5.27 [firebase_core flutter meta plugin_platform_interface]
- cloud_functions_web 4.9.5 [cloud_functions_platform_interface firebase_core firebase_core_web flutter flutter_web_plugins]
- firebase_auth_platform_interface 7.2.7 [_flutterfire_internals collection firebase_core flutter meta plugin_platform_interface]
- firebase_auth_web 5.11.6 [firebase_auth_platform_interface firebase_core firebase_core_web flutter flutter_web_plugins http_parser meta web]
- firebase_core_platform_interface 5.0.0 [collection flutter flutter_test meta plugin_platform_interface]
- firebase_core_web 2.17.0 [firebase_core_platform_interface flutter flutter_web_plugins meta web]
- firebase_crashlytics_platform_interface 3.6.25 [_flutterfire_internals collection firebase_core flutter meta plugin_platform_interface]
- firebase_dynamic_links 5.4.17 [firebase_core firebase_core_platform_interface firebase_dynamic_links_platform_interface flutter meta plugin_platform_interface]
- firebase_dynamic_links_platform_interface 0.2.6+25 [_flutterfire_internals firebase_core flutter meta plugin_platform_interface]
- firebase_messaging_platform_interface 4.5.33 [_flutterfire_internals firebase_core flutter meta plugin_platform_interface]
- firebase_messaging_web 3.8.3 [_flutterfire_internals firebase_core firebase_core_web firebase_messaging_platform_interface flutter flutter_web_plugins meta web]
- firebase_storage 11.7.6 [firebase_core firebase_core_platform_interface firebase_storage_platform_interface firebase_storage_web flutter]
- firebase_storage_platform_interface 5.1.21 [_flutterfire_internals collection firebase_core flutter meta plugin_platform_interface]
- firebase_storage_web 3.9.6 [_flutterfire_internals async firebase_core firebase_core_web firebase_storage_platform_interface flutter flutter_web_plugins http meta web]
- http_multi_server 3.2.1 [async]
- http_parser 4.0.2 [collection source_span string_scanner typed_data]
- http_profile 0.1.0

Additional context and comments

I know that my firebase sdk is quite old compared to the current version, but maybe something comes to your mind that could potentially explain the issue? Do you recall any fixes between 2.31.1 and current version that could resolve it? Migrating to the current version would be a bit inconvenient for us right now.

orestesgaolin avatar Nov 21 '24 09:11 orestesgaolin

Hi @orestesgaolin, thanks for the report. I wasn't able to reproduce the issue on my end. It might be related to a timeout, so increasing the timeout could be a potential solution, though I'm not entirely sure. You might also find it helpful to check out this Reddit thread, as some of the suggested solutions there might address your issue.

SelaseKay avatar Nov 21 '24 13:11 SelaseKay

The person in the reddit thread tries to explain it by their MacBook network issues, and they may be onto something. But in my case:

  • I can see this issue appearing among our users, every day there will be few dozen who cannot access something via the cloud functions
  • it affects both V1 and V2 callable functions
  • it starts after a while, restarting the app typically solves it

So my gut feeling is that it may be related to some internal cache in the context of the application. Whether it's DNS or auth session I think it's common enough that probably some other people have it regularly. In the last 7 days I can see 164 unique users having this issue on Android and 134 on iOS.

orestesgaolin avatar Nov 21 '24 14:11 orestesgaolin

I don't think I'm able to provide more repro steps but today it happened again. I can see all my colleagues are able to call the endpoint but from my Android device I see the same exception.

I'm using now 5.2.0 of the cloud_functions package as opposed to 4.7.5 last time.

The issue does not occur on iOS and another Samsung device on the same network.

The Firebase Auth token gets refreshed normally.

Restarting the app or network connection does not help here.

Restarting the phone didn't help too.

Issue occurs both in debug and release mode.

I also noticed following error message when trying to access DevTools

Log message:

A Dart VM Service on SM G991B is available at: http://127.0.0.1:49445/tMJnW2EYSmU=/

and clicking on the link leads to:

No DevTools instance is registered with the Dart Development Service (DDS).

orestesgaolin avatar Jan 31 '25 14:01 orestesgaolin

What is quite annoying this issue appears in production to dozens of users as well so it's not only dev-ex issue

Image

orestesgaolin avatar Jan 31 '25 15:01 orestesgaolin

Hey @orestesgaolin - a few thoughts on this.

We have tried to match the exceptions on iOS and web (good argument against trying to unify platform exceptions) which unfortunately might be obfuscating the true nature of the exception: https://github.com/firebase/flutterfire/blob/main/packages/cloud_functions/cloud_functions/android/src/main/java/io/flutter/plugins/firebase/functions/FlutterFirebaseFunctionsPlugin.java#L157-L167

So you might be getting "cancelled" or "timeout" but it is coming back as "deadline exceeded".

The next thing; what exactly are you doing in the function? Are you doing firestore writes for example? You could be hitting a quota: https://github.com/firebase/firebase-android-sdk/issues/744#issuecomment-579293997

Last thing; the nuclear option is to point your cloud_functions to a local src, and update the android code I've linked and use crashlytics to log the crash. At least you get the untampered native crash report which might be more illuminating to your issue.

russellwheatley avatar Jan 31 '25 16:01 russellwheatley

what exactly are you doing in the function?

This one is quite simple, it just reads data from Firestore and pings a 3rd party API. I don't see any sign of the call being received on the backend i.e. not even a single log. About the same time I can easily see other users calling this endpoint with no code or dependency version differences.

the nuclear option is to point your cloud_functions to a local src,

That's a really good idea, I will try whenever it happens for the next time

orestesgaolin avatar Feb 01 '25 12:02 orestesgaolin

Just another example of that today, and strangely it was enough to switch from wi-fi to cellular network. Calling the function from the same network afterwards (i.e. wifi -> cellular -> wifi) fixed the issue, no more DEADLINE_EXCEEDED.

orestesgaolin avatar Mar 18 '25 14:03 orestesgaolin

Hey there @orestesgaolin would it be appropiate to mark this as resolved or are there still any more questions?

MichaelVerdon avatar Mar 20 '25 16:03 MichaelVerdon

I'm not sure. Seems like it happens sometimes but unless someone on the Firebase or Flutterfire team can reproduce it, I don't think we are able to resolve it

orestesgaolin avatar Mar 20 '25 17:03 orestesgaolin

Hi there, I can only really recommend trying to upgrade the SDK unless any more information can be provided despite the inconvenience that could be caused from it. What are the implications of upgrading the SDK for you? Maybe we can work through it?

MichaelVerdon avatar Mar 21 '25 14:03 MichaelVerdon

We're at cloud_functions 5.2.0 now, but will try to update to 5.3.4 soon. It shouldn't be that much of a problem, I just need to find time ;)

orestesgaolin avatar Mar 21 '25 14:03 orestesgaolin

Hi there, @orestesgaolin sounds great. I will keep this ticket open in the meantime and let me know how this goes for you! 😊

MichaelVerdon avatar Mar 21 '25 15:03 MichaelVerdon

Hey there @orestesgaolin is the issue persisting for you after an upgrade?

MichaelVerdon avatar Jul 03 '25 13:07 MichaelVerdon

Hey @orestesgaolin. We need more information to resolve this issue but there hasn't been an update in 7 weekdays. I'm marking the issue as stale and if there are no new updates in the next 7 days I will close it automatically.

If you have more information that will help us get to the bottom of this, just add a comment!

google-oss-bot avatar Jul 17 '25 01:07 google-oss-bot

The issue is still present on the SDK 5.2.0. We're aiming to update to the latest and validate if this has been fixed "by the way".

We're seeing this in production, so it's not only debug mode problem

Image

orestesgaolin avatar Jul 17 '25 07:07 orestesgaolin

The issue is still present on the SDK 5.2.0. We're aiming to update to the latest and validate if this has been fixed "by the way".

We're seeing this in production, so it's not only debug mode problem

Image

Do you think you could provide me a minimal repo to investigate this issue? I have not been able to reproduce so far and a minimal repo could go very far, especially if the 'actual error' is being obfuscated.

MichaelVerdon avatar Jul 18 '25 12:07 MichaelVerdon

I don't think I can provide reproducible example. Our calls are extremely simple e.g.

  Future<List<Map<String, dynamic>>> getAvailableFirmwareUpdates() async {
    try {
      final result = await functions.httpsCallable('getAvailableFirmwareUpdates').call<Map<String, dynamic>>(
        {
          'appBuild': appBuild,
          'userAgent': userAgent,
        },
      );
      if (result.data['status'] == 'success') {
        logger.d('getAvailableFirmwareUpdates cloud functions success');
        final list = result.data['response'] as List;
        return [
          for (final item in list) Map<String, dynamic>.from(item as Map),
        ];
      } else {
        logger.w('getAvailableFirmwareUpdates cloud functions failed: ${result.data}');
        return [];
      }
    } catch (e) {
      logger.e('getAvailableFirmwareUpdates failed', e);
      return [];
    }
  }

The issue appears quasi-randomly and I just cannot see a pattern here. Phone has decent internet connection, can access Google services, moreover some cloud functions may still work and only few are getting this DEADLINE_EXCEEDED error. If this was only happening in debug mode, that wouldn't be that big problem, but we see it across iOS and Android in production around the world.

E.g. this is one of the many exception reports in Crashlytics for iOS for the above invocation.

Image

What is also interesting this happens only to callable functions, but the functions are both v1 (Firebase) and v2 (Cloud Run Functions).

orestesgaolin avatar Jul 18 '25 13:07 orestesgaolin

We are seeing a similar issue, but I am not sure if it is exactly the same, and we only noticed this on android.

Our scenario is step by step:

  1. call a httpsCallable function -> works
  2. enable VPN through the app
  3. call a httpsCallable function -> stops working
  4. call a httpsCallable function 5 minutes later -> works again

From this we assume that this happens because okhttp is used in the background, and the connection pool by default is 5 minutes and after the old connection gets released it starts working again.

But we were unable to find any way to release the connection pool on the connection change since it is hidden in the code. The plugin does not provide an instance of okhttp client that would enable to reset it.

If there is a hidden option to release it, please let us know.

PS: Our workaround for now is to wrap it into HTTP requests and not go through the SDK

vrecer avatar Oct 22 '25 12:10 vrecer