drift icon indicating copy to clipboard operation
drift copied to clipboard

Potential issue with isolates

Open jbxbergdev opened this issue 5 months ago • 6 comments

We're using Drift with a background isolate (using flutter_foreground_task). We're using the setup recommended in the Drift docs to set up the drift isolate with spawn and lookupByPortName. It works fine most of the time. However, intermittently, we have really weird crashes in the background isolate due to Drift attempting to inialize with the DatabaseOpener that we configured for the main isolate (in the spawnmethod).

Our setup in more detail:

// We use dependency injection to inject the appropriately configured DriftHolder instance depending on the isolate: 
// In the main isolate, DriftHolder will be initialized with createIsolateWithSpawn, in the background isolate, 
// createIsolateFromPort will be passed to the constructor.

class DriftHolderImpl implements DriftHolder {

  static final _driftConnectPortName = 'DRIFT_CONNECT_PORT';
  OurDatabase? _db;
  final Future<DriftIsolate> Function() provideDriftIsolate;
  final BehaviorSubject<bool> _provideDbLock = BehaviorSubject.seeded(false);

  DriftHolderImpl({required this.provideDriftIsolate});

  @override
  Future<OurDatabase> provideDatabase() async {
    await _provideDbLock.firstWhere((locked) => !locked,);
    _provideDbLock.add(true);
    if (_db != null) {
      _provideDbLock.add(false);
      return _db!;
    }
    final isolate = await provideDriftIsolate();
    _db = OurDatabase(await isolate.connect());
    _provideDbLock.add(false);
    return _db!;
  }

  static Future<DriftIsolate> createIsolateWithSpawn() async {
    final token = RootIsolateToken.instance;
    final isolate = await DriftIsolate.spawn(() {
      if (token != null) {
        BackgroundIsolateBinaryMessenger.ensureInitialized(token);
      }
      return LazyDatabase(() async {
        final dbFolder = await getApplicationDocumentsDirectory();
        final path = join(dbFolder.path, 'ourdb.sqlite');
        if (Platform.isAndroid) {
          await applyWorkaroundToOpenSqlite3OnOldAndroidVersions();
        }
        final cachebase = (await getTemporaryDirectory()).path;
        sqlite3.tempDirectory = cachebase;
        return NativeDatabase(File(path));
      });
    });
    IsolateNameServer.registerPortWithName(isolate.connectPort, DriftHolderImpl._driftConnectPortName);
    return isolate;
  }

  static Future<DriftIsolate> createIsolateFromPort() async {
    final port = IsolateNameServer.lookupPortByName(DriftHolderImpl._driftConnectPortName);
    return DriftIsolate.fromConnectPort(port!);
  }

}

}

Now, when we use the database from the background isolate, sometimes we get this error:

[ERROR:flutter/runtime/dart_vm_initializer.cc(40)] Unhandled Exception: Exception: No platform channel handler registered for background isolate.
dart:ui/platform_dispatcher.dart 678                                              PlatformDispatcher.sendPortPlatformMessage
package:flutter/src/services/_background_isolate_binary_messenger_io.dart 85      BackgroundIsolateBinaryMessenger.send
package:flutter/src/services/platform_channel.dart 244                            BasicMessageChannel.send
package:path_provider_android/messages.g.dart 84                                  PathProviderApi.getApplicationDocumentsPath
package:path_provider_android/path_provider_android.dart 62                       PathProviderAndroid.getApplicationDocumentsPath
package:path_provider/path_provider.dart 121                                      getApplicationDocumentsDirectory
**package:ourpackage/drift_holder.dart 50                                           DriftHolderImpl.createIsolateWithSpawn.<fn>.<fn>**
dart:async/future.dart 315                                                        new Future.sync
package:drift/src/utils/lazy_database.dart 47                                     LazyDatabase._awaitOpened
package:drift/src/utils/lazy_database.dart 64                                     LazyDatabase.ensureOpen
package:drift/src/remote/server_impl.dart 141                                     ServerImplementation._handleEnsureOpen
package:drift/src/remote/communication.dart 165                                   DriftCommunication.setRequestHandler.<fn>
===== asynchronous gap ===========================
package:drift/src/remote/communication.dart 113                                   DriftCommunication.request
package:drift/src/remote/client_impl.dart 173                                     _RemoteQueryExecutor.ensureOpen
package:drift/src/runtime/api/connection_user.dart 169                            DatabaseConnectionUser.doWhenOpened
package:drift/src/runtime/executor/stream_queries.dart 326                        QueryStream.fetchAndEmitData
package:drift/src/runtime/executor/stream_queries.dart 258                        QueryStream._onListenOrResume
package:drift/src/runtime/executor/stream_queries.dart 209                        QueryStream._stream.<fn>
dart:async/stream_impl.dart 1177                                                  _MultiStream.listen.<fn>
dart:async/stream_controller.dart 838                                             _runGuarded
dart:async/stream_controller.dart 716                                             _StreamController._subscribe.<fn>
dart:async/stream_impl.dart 457                                                   _BufferingStreamSubscription._guardCallback
dart:async/stream_controller.dart 715                                             _StreamController._subscribe
dart:async/stream_impl.dart 1179                                                  _MultiStream.listen
package:drift/src/utils/async_map.dart 30                                         AsyncMapPerSubscription.asyncMapPerSubscription.<fn>
dart:async/stream_impl.dart 1177                                                  _MultiStream.listen.<fn>
dart:async/stream_controller.dart 838                                             _runGuarded
dart:async/stream_controller.dart 716                                             _StreamController._subscribe.<fn>
dart:async/stream_impl.dart 457                                                   _BufferingStreamSubscription._guardCallback
dart:async/stream_controller.dart 715                                             _StreamController._subscribe
dart:async/stream_impl.dart 1179                                                  _MultiStream.listen
dart:async/stream_pipe.dart 141                                                   new _ForwardingStreamSubscription
dart:async/stream_pipe.dart 373                                                   new _StateStreamSubscription
dart:async/stream_pipe.dart 500                                                   _DistinctStream._createSubscription
dart:async/stream_pipe.dart 96                                                    _ForwardingStream.listen
dart:async/stream_pipe.dart 141                                                   new _ForwardingStreamSubscription
dart:async/stream_pipe.dart 105                                                   _ForwardingStream._createSubscription
dart:async/stream_pipe.dart 96                                                    _ForwardingStream.listen
dart:async/stream_controller.dart 926                                             new _AddStreamState
dart:async/stream_controller.dart 981                                             new _StreamControllerAddStreamState
dart:async/stream_controller.dart 594                                             _StreamController.addStream
dart:async-patch/async_patch.dart 102                                             _AsyncStarStreamController.addStream
**package:ourpackage/our_class.dart                                               OurClass.methodThatIsCalledOnlyInBackgroundIsolate**

This call stack shouldn't exist. :) methodThatIsCalledOnlyInBackgroundIsolate causes a call to the DatabaseOpener defined in the foreground initialization method createIsolateWithSpawn.

Things to note:

  1. Problem seems to be specific to Android.
  2. We restart the background service (and with it kill/re-spawn our background isolate) quite often. When the issue occurs, in a number of cases, the behavior to persists when the background isolate is killed and re-spawned, but sometimes it works fine then.
  3. We're experiencing this mostly with live users and have so far only been able to reproduce the error once with our devices. We still use Drift 2.21.0 (infrequent updates due to lenghty release process). Do the newer versions have any changes that may affect this behavior?

jbxbergdev avatar Jun 18 '25 22:06 jbxbergdev

await _provideDbLock.firstWhere((locked) => !locked,); _provideDbLock.add(true);

This is an odd mutex implementation - I'm not sure if it's related to this, but I don't think it's fully sound. E.g. if the method is called two times when unlocked, it's possible for both invocations to wait for the firstWhere and for both of them to get continued.

We still use Drift 2.21.0 (infrequent updates due to lenghty release process). Do the newer versions have any changes that may affect this behavior?

No, I don't remember anything there that would have an impact on this. And just to be sure, this is the only place where you're opening a drift database? There really is no way provideDatabase could be called on a background isolate itself?

simolus3 avatar Jun 19 '25 19:06 simolus3

@simolus3 I added this lock mechanism after getting some error message about double-initialized database. This actually fixed it. But thinking about it, there might be better ways to do it. Thanks for the heads-up!

The way our dependency injection is set up, it' safe to say there's no accidental instantiation with the wrong method providing the DriftIsolate providing method. There are completely separate DI implementations for the main and background isolates.

I just went over the documentation again and now am a bit confused: The article on isolates basically recommends the setup as used in my createIsolateWithSpawn method. I just learned the callback passed to DriftIsolate.spawn (which I think is the method showing up in my stack trace) is executed in the Drift worker isolate. I missed that part previously. What gives me some confusion: The API doc on DriftIsolate.spawn specificlly asks to pass the callback as a top-level/static function (which is the way I learned passing callbacks to isolates and am frankly a bit surprised that this also works with non-static inline functions :) ). Given this approach: What happens if the isolate that createIsolateWithSpawn gets called from is terminated? What would happen to the token variable? Could this be my problem (the RootIsolateToken being unavailable after the UI got killed)?

jbxbergdev avatar Jun 19 '25 19:06 jbxbergdev

Not having completely understood the cause of the problem aside, I think using the workaround for old Flutter versions would fix at least the crash, as the solution passes the documents path to the worker isolate via a SendPort vs calling getApplicationDocumentsDirectory within the isolate.

@simolus3 would it make sense to add a parameter to the callback passed to DriftIsolate.spawn that callers from the main isolate can use to pass the documents path?

jbxbergdev avatar Jun 20 '25 05:06 jbxbergdev

The API doc on DriftIsolate.spawn specificlly asks to pass the callback as a top-level/static function (which is the way I learned passing callbacks to isolates and am frankly a bit surprised that this also works with non-static inline functions :)

That note is outdated, Dart supports passing closures between isolates too (as long as the variables closed over can be sent to the background isolate as well). I'll remove it.

What happens if the isolate that createIsolateWithSpawn gets called from is terminated?

The spawned drift isolate would continue to run. Do you think this is an edge case where:

  1. The main isolate spawns a drift isolate, and sends the token?
  2. The main isolate stops.
  3. The background isolate tries to issue a platform channel request?

I'm actually not sure what happens then, but generally isolates don't know if a SendPort is dangling so I don't think this would cause a No platform channel handler registered for background isolate error - my best guess is that platform channels would likely just time out.

(the token variable doesn't magically become null though - if it's sent to the isolate that one will have its own copy)

would it make sense to add a parameter to the callback passed to DriftIsolate.spawn that callers from the main isolate can use to pass the documents path?

I feel like that's a workaround for a problem that shouldn't exist, so let's try to figure this one out :) That said, since you can pass closures, you can call getApplicationDocumentsDirectory(), store the result in a local variable and reference that in the function you pass to DriftIsolate.spawn.

simolus3 avatar Jun 20 '25 09:06 simolus3

Dart supports passing closures between isolates too

That's interesting. I just read the part on that in the Dart documentation on isolates. There's no mention of what happens if the closure accesses fields from outside. So maybe there's a problem, that this may or may not work, given different circumstances?

(the token variable doesn't magically become null though - if it's sent to the isolate that one will have its own copy)

I actually tried commenting out the call to BackgroundIsolateBinaryMessenger.ensureInitialized(token) - That resulted in a different error than the one our users experience.

Do you think this is an edge case where:

The main isolate spawns a drift isolate, and sends the token?
The main isolate stops.
The background isolate tries to issue a platform channel request?

Yes, but that's a wild guess on my part, for a lack of better explanations. The logs we have from users actually don't indicate the UI being killed. It's really a randomly occurring issue.

Another guess: I've seen the stack traces, which include both the calling isolate and the worker isolate stacks is "assembled" along the way. Possibly there's an issue with that and the call stack in reality is a different one?

I feel like that's a workaround for a problem that shouldn't exist, so let's try to figure this one out :)

Agreed. Whatever causes this behavior might lead to other problems down the line? But I think for the time being, I will try passing the path string instead of the token and see what happens. Unfortunately, after the one time I reproduced the behavior, I was never again able to. So can't really verify anything.

jbxbergdev avatar Jun 20 '25 09:06 jbxbergdev

So maybe there's a problem, that this may or may not work, given different circumstances?

My understanding is that as long as the two isolates share a Flutter engine (which they do because you could otherwise only send primitive types), the kinds of messages they're allowed to exchange is consistent. So if it works on your device, it should work everywhere else too.

Variables referenced by the closure will be sent to the other isolate, so as long as these values can be sent there's no problem. You need to be a little careful with fields because the variable actually sent in that case is this, so there's a possibility that you capture more than you intended to.

I actually tried commenting out the call to BackgroundIsolateBinaryMessenger.ensureInitialized(token)

Very interesting, could you share that error message as well?

Yes, but that's a wild guess on my part, for a lack of better explanations. The logs we have from users actually don't indicate the UI being killed. It's really a randomly occurring issue.

And I also realize now that the database would only be initialized on the background isolate in response to the main isolate sending a request, right? So I'm not convinced that's it.

simolus3 avatar Jun 20 '25 14:06 simolus3