taskwarrior
3.0.0 Performance collapse
Hello.
So I've just upgraded to 3.0 and immediately noticed a huge performance drop in Taskwarrior on my system.
I primarily use TW on my Raspberry Pi 5 home server, which is not hardware you would expect great performance from. Even so, the new TW clearly shows that something is wrong in its new implementation.
- What command(s) did you run?
I ran a simple `time task list` with 2.6.2 and 3.0.0 against 744 pending tasks. This is what I get with the old version:
task list > /dev/null 0.08s user 0.00s system 99% cpu 0.085 total
And this is from 3.0:
./task list > /dev/null 2.01s user 0.13s system 98% cpu 2.176 total
As we can see, new Taskwarrior performs about 25 times slower than the previous version.
- What actually happened?
I ran both TWs under strace and noticed over 160k syscalls in the new version, most of them related to I/O on the SQLite DB. Each syscall takes roughly 10-14µs, which adds up to almost 2 seconds of context switches and random page-cache I/O.
I've also tested a direct SQL query with `sqlite3 taskchampion.sqlite3 "select * from tasks where json_extract(data, '$.status') = 'pending';" > /dev/null`, which ran with no performance problems at all:
sqlite3 taskchampion.sqlite3 > /dev/null 0.01s user 0.00s system 94% cpu 0.012 total
- What can I say?
So it seems the new Taskwarrior simply bombards the DB with a query for each task it fetches. If that's the case, issuing synchronous SQL operations one at a time like this is never going to be fast.
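That access pattern is easy to reproduce outside Taskwarrior. Here is a toy sketch (Python's stdlib sqlite3 standing in for the Rust storage layer; the two-column schema is copied from the PRAGMA output shown later in this thread, and the rows themselves are invented):

```python
import sqlite3

# Two-column layout (uuid, data-as-JSON) copied from the PRAGMA output
# later in this thread; the data is invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (uuid STRING PRIMARY KEY, data STRING)")
uuids = [f"uuid-{i}" for i in range(1000)]
conn.executemany("INSERT INTO tasks VALUES (?, '{}')", [(u,) for u in uuids])

# The pattern the strace output suggests: one SELECT (parse, plan, row
# lookup) per task in the working set...
per_task = [
    conn.execute("SELECT data FROM tasks WHERE uuid = ? LIMIT 1", (u,)).fetchone()
    for u in uuids
]

# ...versus a single statement that returns everything in one pass.
all_at_once = conn.execute("SELECT data FROM tasks").fetchall()

assert len(per_task) == len(all_at_once) == 1000
```

Each per-row query is cheap on its own; it's the thousandfold repetition of parse/plan/lookup (and, through the FFI, lock/unlock) that shows up in strace.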
- Paste the output of the `task diag` command.
Old:
task 2.6.2
Platform: Linux
Compiler
Version: 12.1.0
Caps: +stdc +stdc_hosted +LP64 +c8 +i32 +l64 +vp64 +time_t64
Compliance: C++17
Build Features
CMake: 3.26.3
libuuid: libuuid + uuid_unparse_lower
libgnutls: 3.8.0
Build type:
Configuration
File: /home/er/.config/task/taskrc (found), 2090 bytes, mode 100644
Data: /home/er/.local/share/task (found), dir, mode 40755
Locking: Enabled
GC: Enabled
$EDITOR: helix
Server:
CA: -
Certificate: -
Key: -
Trust: strict
Ciphers: NORMAL
Creds:
Hooks
System: Enabled
Location: /home/er/.config/task/hooks
Active: on-modify.timewarrior (executable)
Inactive:
Tests
Terminal: 261x127
Dups: Scanned 3597 tasks for duplicate UUIDs:
Found duplicate 90cc89aa-1b0f-47d1-888f-4e9ca3616dff
Broken ref: Scanned 3597 tasks for broken references:
No broken references found
New:
task 3.0.0
Platform: Linux
Compiler
Version: 12.1.0
Caps: +stdc +stdc_hosted +LP64 +c8 +i32 +l64 +vp64 +time_t64
Compliance: C++17
Build Features
Commit: 3e41fb604
CMake: 3.29.0
libuuid: libuuid + uuid_unparse_lower
Build type:
Configuration
File: /home/er/.config/task/taskrc (found), 2090 bytes, mode 100644
Data: /home/er/.local/share/task (found), dir, mode 40755
Locking: Enabled
GC: Enabled
$EDITOR: helix
Hooks
System: Enabled
Location: /home/er/.config/task/hooks
Active: on-modify.timewarrior (executable)
Inactive:
Tests
Terminal: 261x127
Dups: Scanned 3607 tasks for duplicate UUIDs:
No duplicates found
Broken ref: Scanned 3607 tasks for broken references:
No broken references found
- P.S.
I think you've already heard some complaints about this release, but this is a perfect example of how not to make one. I've used this tool for about two years and genuinely love it. It does everything it should and nothing else, and I really appreciate all the work that has been done here by the devs and the community. But it is really frustrating to see this update dropped on its users' heads without proper testing, a gradual rollout, or automation tools. Please don't make the same mistake twice.
After solving my issue in #3330 I have to say I am having much the same issue. Basic stuff like `task ready` with my list of ~4,500 tasks is virtually unusable - I'm talking run-the-command, go get a cup of coffee and a meal, and manage to drink and eat it all before it finishes kind of slow (I'll get you some `time` and `task debug` output in an hour or however long it takes...).
It's realistically a barrier to me actually using 3.0, so I will probably have to downgrade for now. I'm happy to help in any way I can in terms of testing and such, but my know-how is severely limited wrt. anything needed for development.
That all said, I am glad to see work is ongoing on the project, and as noted in another update-related thread, I am sorry that you usually only hear from the userbase when we are unhappy!
Are you using any hooks on the database?
I am mainly asking because in #3312 as well as #3314, the undesired behaviour appears when hooks are enabled, and I also see some strange reporting of the modification count when adding a new task. (Adding one new task results in four "local" changes; I would expect one.)
I'm having the same issue.
Command | task 3.0.0
---|---
`task ready` | 3.629s
`task list` | 14.127s
`task next` | 1.889s
`task calendar` | 1.970s
The old version isn't available for comparison, but it always took less than a second.
`strace task next` outputs 31205 lines.
I have a lot of UDAs, due to bugwarrior synchronizing with GitLab. However, the times don't change when reducing the taskrc to a minimum.
task diagnostics
task 3.0.0
Platform: Linux
Compiler
Version: 12.1.0
Caps: +stdc +stdc_hosted +LP64 +c8 +i32 +l64 +vp64 +time_t64
Compliance: C++17
Build Features
Commit: 3e41fb604
CMake: 3.29.0
libuuid: libuuid + uuid_unparse_lower
Build type:
Configuration
File: /home/ghost/.config/task/taskrc (found), 1933 bytes, mode 100755
Data: /home/ghost/.local/state/ (found), dir, mode 40700
Locking: Enabled
GC: Enabled
$VISUAL: vim
Hooks
System: Enabled
Location: /home/ghost/.config/task/hooks
Active: on-modify (executable) (symlink)
on-modify.timewarrior (executable) (symlink)
Inactive:
Tests
Terminal: 241x45
Dups: Scanned 6114 tasks for duplicate UUIDs:
No duplicates found
Broken ref: Scanned 6114 tasks for broken references:
No broken references found
> Are you using any hooks on the database?
I am not using any hooks, but hooks are enabled.
$ task diag
task 3.0.0
Platform: Linux
Compiler
Version: 13.2.1 20230801
Caps: +stdc +stdc_hosted +LP64 +c8 +i32 +l64 +vp64 +time_t64
Compliance: C++17
Build Features
Commit: 8a0a98d3e
CMake: 3.29.0
libuuid: libuuid + uuid_unparse_lower
Build type:
Configuration
File: /home/hearth/.taskrc (found), 4718 bytes, mode 100644
Data: /home/hearth/.task (found), dir, mode 40755
Locking: Enabled
GC: Enabled
Hooks
System: Enabled
Location: /home/hearth/.task/hooks
(-none-)
Tests
Terminal: 175x60
Dups: Scanned 4850 tasks for duplicate UUIDs:
No duplicates found
Broken ref: Scanned 4850 tasks for broken references:
No broken references found
Memory usage can also be a bit bonkers. Running `task pri:5.000000 mod pri:5` on around 3,300 tasks ate through all of my RAM and crashed my computer - though, granted, it took a very long time to get there, over an hour. The database is around 22MB in size. I have more RAM than Taskwarrior should ever need.
$ time task ready >/dev/null
There are 95472 local changes. Sync required.
________________________________________________________
Executed in 20.13 mins fish external
usr time 19.67 mins 332.00 micros 19.67 mins
sys time 0.42 mins 145.00 micros 0.42 mins
$ time task rc.hooks=0 >/dev/null
Configuration override rc.hooks=0
There are 95472 local changes. Sync required.
________________________________________________________
Executed in 20.04 mins fish external
usr time 19.60 mins 0.00 micros 19.60 mins
sys time 0.41 mins 390.00 micros 0.41 mins
$ task rc.debug=1 ready
Timer Config::load (/home/hearth/.taskrc) 0.000174 sec
No context set
Hook directory not readable: /home/hearth/.task/hooks
Parse Tree (before command-specific processing)
_original_args
task rc.debug=1 ready
_args
word basename='task' raw='task' BINARY
pair modifier='debug' name='rc' raw='rc.debug=1' separator='=' value='1' CONFIG ORIGINAL
identifier canonical='ready' raw='ready' ORIGINAL CMD READONLY SHOWSID RUNSGC USESCONTEXT ALLOWSFILTER
No context set
Filtered 4850 tasks --> 1618 tasks [all tasks]
ViewTask::render min=75 ideal=303 overage=119 width=194
Perf task 3.0.0 8a0a98d3e 20240403T215306Z init:847 load:0 gc:48105 filter:10014 commit:0 sort:1203478756 render:19919182 hooks:0 other:1668783 total:1225125687
Decided to make the hook directory and give it rwx permissions. No improvement.
Here is the same set of commands on task 2.6.2:
$ task diag
task 2.6.2
Platform: Linux
Compiler
Version: 13.1.1 20230429
Caps: +stdc +stdc_hosted +LP64 +c8 +i32 +l64 +vp64 +time_t64
Compliance: C++17
Build Features
CMake: 3.26.3
libuuid: libuuid + uuid_unparse_lower
libgnutls: 3.8.0
Build type:
Configuration
File: /home/hearth/.taskrc (found), 4718 bytes, mode 100644
Data: /home/hearth/.task (found), dir, mode 40755
Locking: Enabled
GC: Enabled
Server:
CA: -
Certificate: -
Key: -
Trust: strict
Ciphers: NORMAL
Creds:
Hooks
System: Enabled
Location: /home/hearth/.task/hooks
(-none-)
Tests
Terminal: 302x51
Dups: Scanned 4850 tasks for duplicate UUIDs:
No duplicates found
Broken ref: Scanned 4850 tasks for broken references:
No broken references found
$ time task ready >/dev/null
________________________________________________________
Executed in 2.61 secs fish external
usr time 2.52 secs 251.00 micros 2.52 secs
sys time 0.08 secs 150.00 micros 0.08 secs
$ time task rc.hooks=0 ready >/dev/null
Configuration override rc.hooks=0
________________________________________________________
Executed in 2.59 secs fish external
usr time 2.53 secs 421.00 micros 2.53 secs
sys time 0.06 secs 85.00 micros 0.06 secs
$ task rc.debug=1 ready
Timer Config::load (/home/hearth/.taskrc) 0.000311 sec
No context set
Parse Tree (before command-specific processing)
_original_args
task rc.debug=1 ready
_args
word basename='task' raw='task' BINARY
pair modifier='debug' name='rc' raw='rc.debug=1' separator='=' value='1' CONFIG ORIGINAL
identifier canonical='ready' raw='ready' ORIGINAL CMD READONLY SHOWSID RUNSGC USESCONTEXT ALLOWSFILTER
No context set
Filtered 4850 tasks --> 1618 tasks [all tasks]
ViewTask::render min=75 ideal=303 overage=119 width=194
pending.data rw - T1946+000~000-000 L1946+000
completed.data rw - T2904+000~000-000 L2904+000
undo.data rw - T0000+000~000-000 L0000+000
backlog.data rw - T0000+000~000-000 L0000+000
Perf task 2.6.2 - 20240403T220157Z init:1046 load:33314 gc:370795 filter:6333 commit:42 sort:2043416 render:17276 hooks:0 other:61495 total:2533717
There may be a few issues here.
In general, performance is weak right now. Part of that should be relatively easy to fix -- we do a lot of individual queries, especially when modifying data, that could just as well be handled in a single transaction. There may be some unanticipated scaling issues, both in terms of the number of tasks (BTW, is that 4,500 pending tasks, @ashprice, or does that include completed?) and the number of operations ("There are 95472 local changes"). I don't think we have anything that scales with the number of operations, but maybe I've missed something there. I have a vague recollection that some DBs accomplish count(*) by scanning all rows, which would be expensive for 95,472 rows! So, let's see if we can split those out into individual issues. Running a `task sync` with local sync configured should take care of those 95k rows, so if things speed up at that point then we know leaving a lot of un-sync'd operations leads to poor performance.
As I've said elsewhere, developers for Taskwarrior are thin on the ground right now, so I appreciate the kind words and support.
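The single-transaction point above can be illustrated in miniature. This is a hedged sketch in Python's stdlib sqlite3, not the actual TaskChampion code; the schema is borrowed from this thread and the rows are invented:

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (uuid STRING PRIMARY KEY, data STRING)")

rows = [(str(uuid.uuid4()), '{"status": "pending"}') for _ in range(500)]

# Group all 500 inserts under one commit. In SQLite's autocommit mode,
# each statement is otherwise its own transaction -- and on disk, its
# own journal write -- which is exactly the per-operation cost that a
# single enclosing transaction amortizes away.
with conn:  # the connection context manager commits on success
    conn.executemany("INSERT INTO tasks VALUES (?, ?)", rows)

print(conn.execute("SELECT count(*) FROM tasks").fetchone()[0])  # 500
```

The same batching idea applies regardless of whether the bottleneck turns out to be commits, query parsing, or lock acquisition.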
That is 4,850 total tasks; 1,769 are pending. I thought it was pretty damn big, but I see from some of the other threads here that other people have even double this.
The backlog is quite high, yeah. I basically never used the built-in sync in 2.6.2 and earlier - I've always just made sure I'm not doing any TW-related I/O and then run something like syncthing or rsync on the directory. Careful use meant that never caused any issue, even with recurrence. Occasionally I'd delete the backlog file and even strip purged tasks out of undo.data with tools like ripgrep, sed, and uniq. (Pretty sure I never broke anything in ways that are relevant here.)
I set up the 3.0.0 sync to a local directory and synced without issue. Unfortunately, performance is much the same. (I'll edit in the output when it arrives!)
OK, thanks for checking!
In case it helps, too, here are my current `task stats`:
Category | Data
---|---
Pending | 1769
Waiting | 4
Recurring | 173
Completed | 1984
Deleted | 920
Total | 4850
Annotations | 66
Unique tags | 36
Projects | 30
Blocked tasks | 151
Blocking tasks | 244
Undo transactions | 0
Sync backlog transactions | 0
Tasks tagged | 72.6%
Oldest task | 2022-08-13
Newest task | 2024-04-03
Task used for | 1.6y
Task added every | 2h
Task completed every | 7h
Task deleted every | 15h
Average time pending | 3mo
Average desc length | 30 characters
I'm stressed just thinking of having 1,769 pending tasks! I added #3334 to track this particular perf issue.
> Running a `task sync` with local sync configured should take care of those 95k rows
Is there a way to disable sync entirely so that it just skips over any sync-related counts or logic? I looked through the man pages and didn't see anything. In my use case, I don't need to sync the data to any other devices, so it would be superfluous to set up a local sync that has no real use.
Yes, that's #3297.
I had a look into the SQLite file because vit performance is horrible right now, and:
sqlite> PRAGMA table_info(tasks);
0|uuid|STRING|0||1
1|data|STRING|0||0
sqlite> select * from tasks limit 1;
03fcc33c-52ae-44a3-aaf4-3af177e90bf0|{"status":"deleted","end":"1711502131","description":"test","modified":"1711502132","entry":"1711502121"}
The task database is just two columns, where the second column is JSON for all the real task data? So we have to reindex the whole DB every time the user wants to run a filter, or check what tags exist, etc.? What even is the point of using SQLite then?
with all due respect and tons of love for this project ... wtf.
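For what it's worth, the JSON-blob layout doesn't strictly force a full re-parse for common filters: SQLite (3.9.0+) can build an index over an expression such as `json_extract`. A small sketch, using the schema from the PRAGMA output above (the index name and sample row are invented, and this is not something TaskChampion currently does as far as this thread shows):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (uuid STRING PRIMARY KEY, data STRING)")
conn.execute(
    "INSERT INTO tasks VALUES ('03fcc33c', '{\"status\": \"deleted\"}')"
)

# An expression index lets a status filter use an index lookup instead
# of parsing every JSON blob on every query.
conn.execute(
    "CREATE INDEX idx_status ON tasks (json_extract(data, '$.status'))"
)

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT uuid FROM tasks "
    "WHERE json_extract(data, '$.status') = 'deleted'"
).fetchall()
print(plan[0][3])  # the plan should reference idx_status, not a full scan
```

The catch is that the filter expression must match the indexed expression exactly for the planner to use it, so this only helps the handful of hot filters (status, tags) rather than arbitrary queries.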
Okay, forget my complaint about not using the db as a db for now. ElectricR's comment about bombarding the db with queries is way more important.
If I'm reading correctly (I probably am not, since I'm unfamiliar with the codebase and with Rust), I see that:
- `TDB2::pending_tasks()` calls `replica.get_task` for all tasks in the working set.
- I think whoever wrote that code assumed that the replica would be an in-memory replica, so looping through the whole DB would be fast.
- But this seems to not be the case:
  a. `open_replica` calls `tc::Replica::Replica` (the version with 2 args),
  b. which calls `tc_replica_new_on_disk`,
  c. which calls `TCReplica::from(Replica::new(storage))` where `storage` is `StorageConfig::OnDisk`,
  d. I gave up on following the rabbit hole here, but I don't think this ends up copying the DB to memory.
- `replica.get_task` then calls `self.taskdb.get_task`,
- which I am assuming ends up being this one, because we're seeing so much IO on the sqlite db:
fn get_task(&mut self, uuid: Uuid) -> Result<Option<TaskMap>> {
let t = self.get_txn()?;
let result: Option<StoredTaskMap> = t
.query_row(
"SELECT data FROM tasks WHERE uuid = ? LIMIT 1",
[&StoredUuid(uuid)],
|r| r.get("data"),
)
.optional()?;
// Get task from "stored" wrapper
Ok(result.map(|t| t.0))
}
So, tldr I think there are a lot of cases where we are doing something like (pseudocode)
for u in pending_uuids
SELECT data FROM tasks WHERE uuid = u LIMIT 1
which ends up killing performance.
Sorry if I'm being annoying with this. Just trying to make sense of it.
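A hedged sketch of the batched alternative to that pseudocode (Python's stdlib sqlite3 standing in for the Rust storage layer; schema as shown earlier in the thread): collect the UUIDs first, then fetch them all in one statement:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (uuid STRING PRIMARY KEY, data STRING)")
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?)",
    [(f"uuid-{i}", '{"status": "pending"}') for i in range(100)],
)

# Pretend every other task is in the working set.
pending_uuids = [f"uuid-{i}" for i in range(0, 100, 2)]

# One round trip for the whole working set, instead of one SELECT per
# uuid as in the pseudocode above.
placeholders = ",".join("?" for _ in pending_uuids)
rows = conn.execute(
    f"SELECT uuid, data FROM tasks WHERE uuid IN ({placeholders})",
    pending_uuids,
).fetchall()
print(len(rows))  # 50
```

(SQLite caps the number of bound parameters per statement, so a real implementation would chunk very large UUID lists or join against a temporary table instead.)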
Yep, your analysis is correct. There's some work in the TaskChampion repo that will help write performance. If you'd like to work on read performance, please do!
> Yep, your analysis is correct. There's some work in the TaskChampion repo that will help write performance. If you'd like to work on read performance, please do!
It's been a minute since I've programmed anything in C and I'm completely unfamiliar with FFI, so let me know if there's anything I've misunderstood.
From what I can tell, the read performance could be improved by adding a `get_tasks` function to the `StorageTxn` trait:

fn get_tasks(&mut self, uuid: Vec<Uuid>) -> Result<Vec<Option<TaskMap>>>
This loop could then simply push the uuid to a `std::vector` and pass that to `get_tasks`. The `maybe_task.is_some()` could potentially happen on the Rust side of things as well, in which case `get_tasks` would simply return `Result<Vec<TaskMap>>`.
Would this be a workable solution?
Yep! The pseudocode in @imyxh's comment is, essentially, `TDB2::pending_tasks`. And, that's used a lot!
BTW, like most DBs, SQLite is page-based, and pretty quickly pages its DB into memory. With a decent OS, those pages stay in memory, so at some level we are reading from memory. However, there's a lot of overhead to lock and open the DB (see also #3418), and a lot of overhead to parse each query and return the result, so this `pending_tasks` is certainly not as fast as it could be. But it may not be the worst offender in terms of performance -- some profiling would be required to determine that. Probably running a profiler is the action item from this issue.
We already have `Replica::all_tasks` and `all_task_data`, so it may make sense to add `Replica::pending_tasks` and `pending_task_data`, which could then join on the working_set table (as opposed to taking an arbitrary list of UUIDs).
I suspect the implementation of this would be relatively straightforward, and based on existing examples at both the `Replica` and underlying `TaskDB` levels in TaskChampion. The connection to C++ is now much simpler than it used to be, using cxx, so that should also be relatively straightforward. And then calling the result from TDB2 is also pretty easy. The tricky bit is implementing it in TaskChampion and then developing against that in Taskwarrior, since they are different repos -- but I can help with that. This is definitely worth doing even if it's not the biggest contributor to poor performance. Do you want to give it a shot? If so, let's open a new issue to track the work.
UPDATE: I created https://github.com/GothenburgBitFactory/taskchampion/issues/452 for that work. Feel free to comment there and I can assign it to you!
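A rough illustration of what joining on the working_set table could look like (Python's stdlib sqlite3; the working_set column names here are assumptions for illustration, not the real TaskChampion schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tasks (uuid STRING PRIMARY KEY, data STRING);
-- Column names below are assumptions; the real schema may differ.
CREATE TABLE working_set (id INTEGER PRIMARY KEY, uuid STRING);
INSERT INTO tasks VALUES
    ('a', '{"status":"pending"}'),
    ('b', '{"status":"completed"}');
INSERT INTO working_set VALUES (1, 'a');
""")

# pending_task_data as a single join: one statement returns every task
# in the working set, with no per-uuid lookups.
rows = conn.execute(
    "SELECT w.id, t.uuid, t.data "
    "FROM working_set w JOIN tasks t ON t.uuid = w.uuid"
).fetchall()
print(len(rows))  # 1
```

This keeps the UUID list on the database side entirely, which also sidesteps any per-statement parameter limits a long IN (...) list would hit.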
I think I'll close this bug out, as it's more of a theme than an issue with a resolution. I think anything actionable here has either been addressed or is covered in https://github.com/GothenburgBitFactory/taskwarrior/labels/topic%3Aperformance. If not, please add new issues!