UI still freezes while using Isolates
Hello,
I am using Moor, but I noticed that the UI freezes during a query. For this reason I switched to using Isolates, but the UI is still freezing and I don't know what I'm doing wrong. I am using the code from the docs, adapted to use an existing database from the filesystem.
I pared my project down a bit to have a simpler example that shows the problem. Once the FAB is clicked, I start showing a CircularProgressIndicator. It spins initially, but when the query is sent it freezes until the data is loaded...
Any help would be much appreciated! If needed, I can make this whole test project available.
Thanks
main
import 'dart:io';
import 'dart:isolate';

import 'package:flutter/material.dart';
import 'package:flutter/services.dart';
import 'package:moor/ffi.dart';
import 'package:moor/isolate.dart';
import 'package:moor/moor.dart';
import 'package:path/path.dart';
import 'package:path_provider/path_provider.dart';

import 'infrastructure/database/flights_dao.dart';
import 'infrastructure/database/moor_database.dart';

void main() async {
  WidgetsFlutterBinding.ensureInitialized();
  await _initDb();
  runApp(
    MaterialApp(
      title: 'Lazy Logbook',
      home: HomeWidget(),
    ),
  );
}

class HomeWidget extends StatefulWidget {
  const HomeWidget({
    Key key,
  }) : super(key: key);

  @override
  _HomeWidgetState createState() => _HomeWidgetState();
}

class _HomeWidgetState extends State<HomeWidget> {
  bool loading = false;
  List<FlightComposed> flights = [];

  @override
  Widget build(BuildContext context) {
    return Scaffold(
        floatingActionButton: FloatingActionButton(
          onPressed: () async {
            setState(() {
              loading = true;
            });
            MoorIsolate isolate = await _createMoorIsolate();
            var tempflights = await FlightsDao(MoorDatabase.connect(
                    DatabaseConnection.delayed(
                        isolate.connect(isolateDebugLog: true))))
                .getAllFlightsComposed();
            setState(() {
              flights = tempflights;
              loading = false;
            });
            isolate.shutdownAll();
          },
        ),
        appBar: AppBar(
          title: Text('test'),
        ),
        body: loading
            ? CircularProgressIndicator()
            : ListView.builder(
                itemCount: flights.length,
                itemBuilder: (context, index) {
                  return ListTile(
                    title: Text(flights[index].flight.flightCode.toString()),
                  );
                }));
  }
}

Future<void> _initDb() async {
  final dbFolder = await getApplicationDocumentsDirectory();
  var path = join(dbFolder.path, 'mcc2.db');
  await Directory(dirname(path)).create(recursive: true);
  ByteData data = await rootBundle.load(join("assets", "mcc.db"));
  List<int> bytes =
      data.buffer.asUint8List(data.offsetInBytes, data.lengthInBytes);
  await File(path).writeAsBytes(bytes, flush: true);
}

Future<MoorIsolate> _createMoorIsolate() async {
  // this method is called from the main isolate. Since we can't use
  // getApplicationDocumentsDirectory on a background isolate, we calculate
  // the database path in the foreground isolate and then inform the
  // background isolate about the path.
  final dir = await getApplicationDocumentsDirectory();
  final path = join(dir.path, 'mcc2.db');
  final receivePort = ReceivePort();

  await Isolate.spawn(
    _startBackground,
    _IsolateStartRequest(receivePort.sendPort, path),
  );

  // _startBackground will send the MoorIsolate to this ReceivePort
  return (await receivePort.first as MoorIsolate);
}

void _startBackground(_IsolateStartRequest request) {
  // this is the entry point from the background isolate! Let's create
  // the database from the path we received
  final executor = VmDatabase(File(request.targetPath), logStatements: true);
  // we're using MoorIsolate.inCurrent here as this method already runs on a
  // background isolate. If we used MoorIsolate.spawn, a third isolate would be
  // started which is not what we want!
  final moorIsolate = MoorIsolate.inCurrent(
    () => DatabaseConnection.fromExecutor(executor),
  );
  // inform the starting isolate about this, so that it can call .connect()
  request.sendMoorIsolate.send(moorIsolate);
}

// used to bundle the SendPort and the target path, since isolate entry point
// functions can only take one parameter.
class _IsolateStartRequest {
  final SendPort sendMoorIsolate;
  final String targetPath;

  _IsolateStartRequest(this.sendMoorIsolate, this.targetPath);
}
query in the DAO
Future<List<FlightComposed>> getAllFlightsComposed() async {
  final departureAirfields = alias(airfieldDtos, 'departureAirfields');
  final arrivalAirfields = alias(airfieldDtos, 'arrivalAirfields');

  var rows = await (select(flightDtos)
        ..orderBy([
          (t) => OrderingTerm(expression: t.flightDate),
          (t) => OrderingTerm(expression: t.departureTime),
        ]))
      .join([
    innerJoin(aircraftDtos,
        aircraftDtos.aircraftCode.equalsExp(flightDtos.aircraftCode)),
    innerJoin(
        departureAirfields,
        departureAirfields.airfieldCode
            .equalsExp(flightDtos.departureAirfieldCode)),
    innerJoin(
        arrivalAirfields,
        arrivalAirfields.airfieldCode
            .equalsExp(flightDtos.arrivalAirfieldCode)),
    innerJoin(pilotDtos, pilotDtos.pilotCode.equalsExp(flightDtos.picCode))
  ]).get();

  var result = rows.map((row) {
    var flightDetailed = FlightComposed(
      flight: row.readTable(flightDtos),
      aircraft: row.readTable(aircraftDtos),
      departureAirfield: row.readTable(departureAirfields),
      arrivalAirfield: row.readTable(arrivalAirfields),
      pic: row.readTable(pilotDtos),
    );
    return flightDetailed;
  }).toList();
  return result;
}
How many flights are there, roughly speaking? There's some overhead that we can't avoid: even if we're using isolates, we need to serialize and copy result data, which takes time. For queries with huge result sets, isolates might even make things worse.
If you're loading lots of data at once, it might make sense to load it in chunks by introducing a LIMIT clause. On the UI side, packages like incrementally_loading_listview can help.
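A minimal sketch of what chunked loading could look like, assuming the flightDtos table from the DAO above (the getFlightsPage name and FlightDto row class are hypothetical):
// sketch: load one page of flights at a time instead of everything at once
Future<List<FlightDto>> getFlightsPage(int page, {int pageSize = 100}) {
  return (select(flightDtos)
        ..orderBy([(t) => OrderingTerm(expression: t.flightDate)])
        ..limit(pageSize, offset: page * pageSize))
      .get();
}
Each scroll event would then fetch the next page rather than the full result set in one go.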
Dear Simon,
thanks a lot for your quick reply and suggestions.
There are only about 2300 flights, so I was a bit surprised that this would already be too much. I had already tried limiting the flights to 100 per query, but this didn't make a big difference. However now I found out the culprit was probably that I was running the app in debug mode. If I run it in profile mode, the performance is already much better...
Brgds
However now I found out the culprit was probably that I was running the app in debug mode. If I run it in profile mode, the performance is already much better...
Is there documentation on this anywhere? Obviously, debug performance can be worse with many frameworks/libraries, but I'm hoping to understand the reasoning specific to Drift. I am migrating from Isar (NoSQL) to Drift, and the biggest thing I've noticed is very slow startup times. My use case is I have ~9000 finance transactions, each of which has about 15-20 properties. On app launch, these are all queried because the app does a bunch of calculations based on these transactions. There is probably room for optimization on my part, but I just want to be sure that there's nothing I'm missing.
Nothing immediately comes to mind; drift is not using a lot of costly assertions or anything else that would obviously be much slower in debug builds.
Are you loading the transactions one-by-one or in a single query? If anything interesting from drift pops out in the profiler, I can also take a look at that. There is a cost from going from the Map<String, dynamic> given to us by the database to the data class instances, but doing that for 9000 rows shouldn't cause a lot of lag.
Yeah, I've checked the CPU Profiler and don't see anything obvious. The majority of the time was spent on isolate communication, but that was typical even before introducing drift, because I make heavy use of isolates when crunching numbers on the transaction data. I'm new to drift so I may be doing things incorrectly, but my transaction parsing logic looks like this:
Stream<List<Transaction>> watchTransactions({
  required String budgetId,
}) {
  final query = select(dbTransactions).join([
    leftOuterJoin(
      dbSubTransactions,
      dbSubTransactions.transactionId.equalsExp(dbTransactions.uuid),
    ),
  ]);
  query.where(dbTransactions.budgetId.equals(budgetId) &
      dbTransactions.isDeleted.equals(false));
  query.orderBy([OrderingTerm.desc(dbTransactions.date)]);

  return query.watch().map((rows) {
    final collected = rows.parse(
      dbTransactions,
      (p0) => Transaction(
        id: p0.uuid,
        amount: p0.amount,
        date: p0.date,
        accountId: p0.accountId,
        isDeleted: p0.isDeleted,
        payeeName: p0.payeeName,
        payeeId: p0.payeeId,
        categoryId: p0.categoryId,
        categoryName: p0.categoryName,
        memo: p0.memo,
        transferAccountId: p0.transferAccountId,
        transferTransactionId: p0.transferTransactionId,
        matchedTransactionId: p0.matchedTransactionId,
        importId: p0.importId,
        flagColor: p0.flagColor?.name,
        subTransactions: List.empty(),
      ),
      dbSubTransactions,
      (p0) => SubTransaction(
        id: p0.uuid,
        transactionId: p0.transactionId,
        amount: p0.amount,
        isDeleted: p0.isDeleted,
        payeeName: p0.payeeName,
        payeeId: p0.payeeId,
        categoryId: p0.categoryId,
        categoryName: p0.categoryName,
        memo: p0.memo,
        transferAccountId: p0.transferAccountId,
        transferTransactionId: p0.transferTransactionId,
      ),
    );
    return collected.entries
        .map((e) => e.key.copyWith(subTransactions: e.value.toList()))
        .toList();
  });
}
parse is defined as an extension on the list of rows returned by the join:
// Parses the rows returned by a JOIN where each row
// is comprised of columns that can produce T and D when
// `readTable(orNull)` is called. The result is a map of
// T2 to a set of D2, which are produced when calling the
// `transformX` functions on T and D respectively.
Map<T2, Set<D2>> parse<T, D, T2, D2>(
  TableInfo<Table, T> tableA,
  T2 Function(T) transformA,
  TableInfo<Table, D> tableB,
  D2 Function(D) transformB,
) {
  final map = <T2, Set<D2>>{};
  for (final row in this) {
    final key = transformA(row.readTable(tableA));
    final value = row.readTableOrNull(tableB);
    final existing = map[key] ?? {};
    map[key] = value != null ? {...existing, transformB(value)} : existing;
  }
  return map;
}
It's very possible that collected.entries.map((e) => e.key.copyWith(subTransactions: e.value.toList())).toList(); is part of the lag here, but I don't recall seeing that in the profiler... I should look deeper.
That said, I've confirmed it is definitely tied to the number of transactions being processed, because it doesn't happen for smaller accounts with fewer transactions.
Just following up: I did a very rudimentary "Stopwatch test" and printed the elapsed time between when the query is invoked and when the function returns, and it's about 5-6s, which correlates with the initial lag I'm seeing during startup. Even worse, the vast majority of that time is the query execution itself. The parsing of rows is much less of a big deal, which isn't what I expected. Here's the breakdown:
- Total time: 5727ms (5.7s)
- Start query at 0ms
- Received first rows at 5661ms (98.84% of total time)
- Parsed rows at 5708ms (0.8% of total time)
- Mapped to domain models at 5727ms (0.3% of total time)
So it looks to me like the actual query execution takes ~99% of the time, but I don't think I'm doing anything particularly interesting. FWIW the query returned 9371 rows in this case.
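For reference, a rough sketch of how such a stopwatch breakdown can be taken (query and the parse/map steps refer to the snippets shown earlier):
// sketch: time each stage of the load separately
final sw = Stopwatch()..start();
final rows = await query.get();
print('Received first rows at ${sw.elapsedMilliseconds}ms');
// ...then call rows.parse(...) and the copyWith mapping as in the earlier
// snippet, printing sw.elapsedMilliseconds after each step.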
My use case is I have ~9000 finance transactions, each of which has about 15-20 properties. On app launch, these are all queried because the app does a bunch of calculations based on these transactions
What ends up happening with the result of the calculations? Is there any reason the calculations need to be done in dart instead of sqlite?
If you definitely need to load everything into dart memory, there are a couple things to keep in mind when optimizing.
- Do the columns you're joining on have an index?
- Do the columns in your WHERE clause have an index?
- How many rows total are there in the Transactions table? What’s the cardinality of the budgetId column?
- Your query selects the full Cartesian product of both tables - i.e. if each of the 9000 Transactions has 2 SubTransactions, then your query would actually select 18000 rows of every column from both tables. That could be a lot of data - is all of it needed for processing? Could you query each table separately and do the "join" in dart?
In general though I'd recommend looking for ways to reduce the volume of data that's being touched: Can the UI be redesigned? No UI can show 9000 items. Can the "processing" be an SQL update? Get the database engine to do as much processing as possible.
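To make that last point concrete, here is a minimal sketch of pushing one calculation into sqlite instead of dart, using the table names from this thread but assuming an integer amount column:
// sketch: let sqlite compute an aggregate instead of loading all rows
Future<int> totalAmount(String budgetId) async {
  final sum = dbTransactions.amount.sum();
  final query = selectOnly(dbTransactions)
    ..addColumns([sum])
    ..where(dbTransactions.budgetId.equals(budgetId) &
        dbTransactions.isDeleted.equals(false));
  final row = await query.getSingle();
  return row.read(sum) ?? 0;
}
A query like this returns a single row regardless of how many transactions exist, so nothing has to cross the isolate boundary except the final number.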
What ends up happening with the result of the calculations? Is there any reason the calculations need to be done in dart instead of sqlite?
I actually had hopes of offloading a lot of the processing currently done in dart to sqlite as I think much of the logic can be done just as easily in SQL. Currently much of the logic I'm doing is done on a background Isolate due to the number of transactions. I'm utilizing package:actors for this and it works great. That said, I need to figure out why there's so much UI jank with a simple JOIN (likely my fault, I'm not blaming the package) before I feel confident enough to go forward with migrating logic to the query level.
Do the columns you're joining on have an index?
They didn't, but I added one and it did not help.
Do the columns in your WHERE clause have an index?
Same story as above. I will add that I haven't done anything database-related since college, so I'm refreshing myself on everything. Here are the indexes I added:
// On Transactions
@TableIndex(name: 'idx_txn', columns: {#budgetId, #isDeleted})
// On SubTransactions
@TableIndex(name: 'idx_sub_txn', columns: {#budgetId, #transactionId, #isDeleted})
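One way to double-check that sqlite actually uses a new index is to run EXPLAIN QUERY PLAN through a custom query; a sketch, assuming drift's default snake_case table and column names:
// sketch: print sqlite's query plan for the filtered select
final plan = await customSelect(
  'EXPLAIN QUERY PLAN '
  'SELECT * FROM db_transactions WHERE budget_id = ? AND is_deleted = 0',
  variables: [Variable.withString(budgetId)],
).get();
for (final row in plan) {
  print(row.data); // look for "USING INDEX idx_txn" vs. a full "SCAN"
}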
How many rows total are there in the Transactions table?
Since this was a select-all, it's safe to say it's very close to, if not exactly, the 9300 rows I mentioned above. I have not yet figured out how to dump the database from the Simulator (I could obviously do this somewhat in dart code) and inspect it using some sqlite viewer, but I will do that eventually.
What’s the cardinality of the budgetId column?
Disclaimer: I had to look up cardinality, but the answer is "I believe extremely low". Of the ~9000 transactions, 99% of them are under a single budget id. The remainder are assigned to one of 3-4 other possible budget ids.
Your query selects the full Cartesian product of both tables - i.e. if each of the 9000 Transactions has 2 SubTransactions, then your query would actually select 18000 rows of every column from both tables.
Fair point. The vast majority of transactions will not have sub-transactions, but it's still a valid point.
That could be a lot of data - is all of it needed for processing?
I am sure I could optimize this a bit. Some calculations DO use all the transactions (to determine values since the "beginning of time"), but maybe I could change how this is done. To reiterate, I'm coming from Isar (NoSQL) and this was not an issue with Isar, even though it's not optimal. The UI freeze is only noticeable with drift.
Could you query each table separately and do the "join" in dart?
This is what I was doing with Isar, but the actual hang is happening in the query and not the "join". That's weird to me, because AFAIK the default setup (createInBackground) uses an Isolate for the query.
Huh yeah that does seem odd - optimizing a slow query is one thing, but I overlooked that the primary issue is UI jank despite using isolates 😅
I'd be curious to see how the drift_sqflite backend performs in this scenario. I think all you'd need to do is swap out your NativeDatabase constructor with a SqfliteQueryExecutor.
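A rough sketch of the swap, assuming a LazyDatabase-based setup like the one from the drift docs (the path is illustrative):
// sketch: open the database through drift_sqflite instead of NativeDatabase
LazyDatabase(() async {
  final dir = await getApplicationDocumentsDirectory();
  final path = join(dir.path, 'db.sqlite');
  // previously: return NativeDatabase.createInBackground(File(path));
  return SqfliteQueryExecutor(path: path, logStatements: true);
});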
I'd be curious to see how the drift_sqflite backend performs in this scenario. I think all you'd need to do is swap out your NativeDatabase constructor with a SqfliteQueryExecutor.
Gave this a shot this morning. Like you said, just swapped out NativeDatabase with return SqfliteQueryExecutor(path: file.path);
Unfortunately, no improvement 😞
I'll try to reproduce this as well, but it'll probably take me until the end of the week.
Mapped to domain models at 5727ms (0.3% of total time)
Do you define the mapping as the step where rows.parse is called? At that point, drift will have already mapped the underlying rows to its own classes for the two tables. This step happens on the main isolate, after the query ran and we have the raw results. So if that is how the stopwatch is set up, it might hide that the actual query is pretty fast and that it's drift taking too long to map the results.
Do you define the mapping as the step where rows.parse is called? At that point, drift will have already mapped the underlying rows to its own classes for the two tables. This step happens on the main isolate, after the query ran and we have the raw results. So if that is how the stopwatch is set up, it might hide that the actual query is pretty fast and that it's drift taking too long to map the results.
Sorry my timeline wasn't super great. The line that printed out:
Received first rows at 5661ms (98.84% of total time)
Was the line just before rows.parse is called. I printed out Isolate.current.debugName and it was indeed main, like you said, so everything that happened in map is on the main isolate. My read on that was that the time between query.watch() being called and map being called (aka query execution) is the root of the slowdown here, but let me know if you disagree.
My read on that was that the time between query.watch() being called and map being called (aka query execution) is the root of the slowdown here
I agree. But when map is called, drift will have mapped the query results into its own data classes generated from your table definitions already. These are the p0 objects you're mapping into your own classes afterwards.
So the measurement shows that it takes a long time for drift to run the query and map the results to the drift-generated classes, but we don't know how much of that time is spent executing the query vs. drift doing its mapping. I suspect the latter is taking most of the time.
You could measure the time for the actual execution with:
class TimingInterceptor extends QueryInterceptor {
  @override
  Future<List<Map<String, Object?>>> runSelect(
      QueryExecutor executor, String statement, List<Object?> args) async {
    final sw = Stopwatch()..start();
    final rows = await super.runSelect(executor, statement, args);
    sw.stop();
    print('Ran $statement in ${sw.elapsed}');
    return rows;
  }
}
(you can apply this by passing NativeDatabase.createInBackground(...).interceptWith(TimingInterceptor()) to the database class)
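For instance (a sketch; MyDatabase stands in for the generated database class):
// sketch: wrap the executor before handing it to the database
final db = MyDatabase(
  NativeDatabase.createInBackground(File('app.db'))
      .interceptWith(TimingInterceptor()),
);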
OK that's super helpful, and has returned some interesting data. I slightly tweaked the print message because the elapsed time was getting pushed outside the debug message length:
if (sw.elapsed > 1.seconds) {
  print('BRANDON: ${sw.elapsed} (${rows.length}) : $statement');
}
Printed:
flutter: BRANDON: 0:00:06.431995 (1) : SELECT * FROM "db_spend_trackers" ORDER BY "name" ASC;
This table has 1 row in it 👀 So now I gotta figure out what I've done to cause this haha:
Here's the table definition:
class DbSpendTrackers extends Table {
  IntColumn get id => integer().autoIncrement()();
  TextColumn get name => text()();
  TextColumn get nickName => text().nullable()();
  TextColumn get sourceId => text()();
  TextColumn get budgetId => text()();
  BoolColumn get isDeleted => boolean()();
  TextColumn get type => textEnum<SpendTrackerType>()();
}
Yeah this is pretty wild. I used an sqlite viewer to view the DB and everything looks good. The same SQL statement run from that viewer runs in ~10ms.
OK Update: I commented out the query for the transactions table and the one for the above table began executing much faster, so it was a red herring that it took so long in my timings above.
Assuming a single Isolate (i.e. that drift hasn't spun up a pool, which seems to be the case from the list of Isolates displayed by my IDE), this just means the transactions query is blocking the execution of the other queries. That's fine, but it doesn't explicitly explain the jank. So I'm still trying to understand that part.
Additionally, the timing/speed of the transactions table query is fast, so it's definitely something else related to that query that's causing the issue, and it must be happening on the main isolate.
Editing to add: this does not happen in profile (so presumably release) mode. I think that was stated at the beginning of the thread, but I just wanted to reiterate it.
this just means the transactions query is blocking the execution of the other queries
Ah true - the background isolate will serve requests in the order in which they come in, but can only serve one at a time. So if the main isolate is making requests concurrently, some queries will seemingly take longer than they should because they need to wait for their turn. Installing the timing interceptor on the background isolate should yield more accurate measurements:
LazyDatabase(() async {
  final file = File(/*...*/);
  final isolate = await DriftIsolate.spawn(() {
    return NativeDatabase(file).interceptWith(TimingInterceptor());
  });
  return await isolate.connect();
});
But still, this doesn't explain why this only happens in debug builds - it has to be something happening in Dart.
Yeah, I've checked the CPU Profiler and don't see anything obvious. The majority of the time was spent on isolate communication, but that was typical even before introducing drift, because I make heavy use of isolates when crunching numbers on the transaction data.
Is there a way to start and stop the profiler so that it primarily measures the transaction query, or is that impossible due to concurrent activity in the application? Do you see the time spent in JoinedSelectStatement._mapRow or $TransactionsTable.map? Does anything stand out there?
OK I appreciate you pointing me back to the CPU Profiler, as there are some interesting things happening there:
- JoinedSelectStatement._columnsWithName
- JoinedSelectStatement._mapRow
- InsertStatement.createContext
The last of these jumps out to me because I am probably doing something sub-optimal in how I'm handling updates to the transactions table. The API I'm hitting supports "delta" requests, so basically the process is:
- Hit the API to fetch all transactions on first launch (on my account this equates to ~9k transactions). In this payload there is an int that can be provided in the next request so that you only receive transactions that have changed (modified or deleted) since that int.
- At each subsequent app launch (or every X minutes), the app hits the API again and provides the int. This generally only returns a handful of transactions, but obviously could return many if the user did some kind of bulk modification.
For both of these inserts/updates I'm using a transaction that calls insertOnConflictUpdate and passes a DbTransactionsCompanion.insert with all fields from the data returned by the API. I do this for both Transactions and SubTransactions. I wonder if this is not the right way to go about this. Should I insert only on the first bullet above and replace only on the second?
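Roughly, the upsert loop described above looks like this (a sketch; apiTransactions and toCompanion are hypothetical stand-ins for the API payload and its mapping):
// sketch: one insertOnConflictUpdate per row, inside a transaction
await transaction(() async {
  for (final t in apiTransactions) {
    await into(dbTransactions).insertOnConflictUpdate(t.toCompanion());
  }
});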
Thanks for providing the details from the profiler! Spending so much time in _columnsWithName is ridiculous; there really is no reason to compute that over and over again for every row (which will have the same column names as all other rows) - fixed in 2d00149cb90da1187a36f1fe96e2e27e3ae30827.
I do this for both Transactions and SubTransactions. I wonder if this is not the right way to go about this.
I don't expect replace to have a different performance profile. At the moment, a source of inefficiency is that every insert creates its own SQL. When issuing many inserts with the same "shape" (columns appearing in the VALUES clause), we could potentially use that to not generate the SQL multiple times. But InsertStatement.createContext delegates most of its work to other functions not showing up in the profiler - nothing in that method stands out to me directly. And updates (through replace) will likely have the same issue.
I just did a test against that commit and here's what I saw:
BEFORE
Test 1
flutter: [W] Slow query (670ms)! SELECT "db_transactions"."uuid" AS "db_transac<…>
flutter: [W] Slow query (7827ms)! SELECT * FROM "db_spend_trackers" WHERE "budge<…>
Test 2
flutter: [W] Slow query (806ms)! SELECT "db_transactions"."uuid" AS "db_transac<…>
flutter: [W] Slow query (7341ms)! SELECT * FROM "db_spend_trackers" WHERE "budge<…>
AFTER
Test 1
flutter: [W] Slow query (789ms)! SELECT "db_transactions"."uuid" AS "db_transac<…>
flutter: [W] Slow query (1440ms)! SELECT * FROM "db_spend_trackers" WHERE "budge<…>
Test 2
flutter: [W] Slow query (527ms)! SELECT "db_transactions"."uuid" AS "db_transac<…>
flutter: [W] Slow query (1644ms)! SELECT * FROM "db_spend_trackers" WHERE "budge<…>
As a reminder, the log logic is:
if (sw.elapsed > 500.milliseconds) {
  logger.w('Slow query (${sw.elapsedMilliseconds}ms)! ${statement.substring(0, 50)}');
}
~The BEFORE has fewer queries over the 500 threshold BUT the AFTER definitely took a big chunk out of the slowest query (which was likely just held up due to the Transactions query). This was without the interceptor installed on the Isolate, I'll post another comment if that ends up making a difference in these timings.~
Edit: I updated all timings to use the interceptor on the isolate because I think it painted a more accurate picture of the impacts of the change!
Do you think the remaining slowness is due to the issue you mentioned above?
At the moment, a source of inefficiency is that every insert creates its own SQL. When issuing many inserts with the same "shape" (columns appearing in the VALUES clause), we could potentially use that to not generate the SQL multiple times...
Sorry to keep nagging you with the profiler, but is any change visible there? I'm actually surprised to see this impact the measured times on the isolate at all, since all of the result mapping is happening on the main isolate.
For the inserts, I agree that the duplicate query building could be a source of remaining slowness. If you're not using them already, I suggest taking a look at batches for bulk operations though. They won't fix the duplicate building problem, but they will bundle everything into a single message sent to the executor isolate, so there's less async overhead compared to many individual statements.
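A minimal sketch of the batch variant (companions standing in for the DbTransactionsCompanion.insert values built from the API payload):
// sketch: bundle the ~9k upserts into one message to the executor isolate
await batch((b) {
  b.insertAllOnConflictUpdate(dbTransactions, companions);
});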
Sorry to keep nagging you with the profiler, but is any change visible there?
Here's what I'm seeing now:
- BaseExecutor.runRequest
- BaseExecutor.runRequest (again)
- DriftCommunication.send
- JoinedSelectStatement.mapRow
- InsertStatement.createContext
I suggest taking a look at batches for bulk operations though
I'll take a look at batches, thanks for recommending them!
Here are timings after moving the heaviest insertion procedure to a batch (insertTransactions inserts ~9k entities, the rest are only inserting ~1-50 entities and don't even show up on the CPU profiler).