
Speed up update_memory_state by batching FSRS memory state calculation

Open DanielPechersky opened this issue 3 months ago • 34 comments

Break down update_memory_state into three cases:

  1. Clear FSRS data for cards
  2. Update memory state for cards without items
  3. Update memory state for cards with items

Use fsrs.memory_state_batch over card.set_memory_state when handling case 3 for performance gains.
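The three-way split above can be sketched as a plain partition step; the names `Case` and the card/item shapes below are illustrative stand-ins, not the actual Anki or fsrs-rs types:

```rust
// Hypothetical sketch of the three-case breakdown described in the PR.
// `Case` and the (id, items) shapes are assumptions for illustration;
// the real code works with Anki's Card and FSRS item types.
#[derive(Debug, Clone, PartialEq)]
enum Case {
    ClearFsrsData,       // case 1: FSRS data should be cleared
    NoItems,             // case 2: card has no review items
    WithItems(Vec<u32>), // case 3: card has items to feed to FSRS in a batch
}

// Split cards into the three cases so that case 3 can be handled with a
// single batched FSRS call instead of one call per card.
fn partition(cards: Vec<(u64, Case)>) -> (Vec<u64>, Vec<u64>, Vec<(u64, Vec<u32>)>) {
    let (mut clear, mut no_items, mut with_items) = (vec![], vec![], vec![]);
    for (id, case) in cards {
        match case {
            Case::ClearFsrsData => clear.push(id),
            Case::NoItems => no_items.push(id),
            Case::WithItems(items) => with_items.push((id, items)),
        }
    }
    (clear, no_items, with_items)
}
```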

DanielPechersky avatar Sep 14 '25 11:09 DanielPechersky

TODO: I don't update progress correctly

DanielPechersky avatar Sep 14 '25 12:09 DanielPechersky

assuming card.set_memory_state is much slower than everything else.

Just a guess, but I think retrieving and updating each card individually might be the slowest steps. So, we might get a speedup by retrieving and updating cards in batches. Trying to process all cards at once may cause out of memory issues though, if the user has millions of cards.

let mut card = self.storage.get_card(card_id)?.or_not_found(card_id)?;
self.update_card_inner(&mut card, original, usn)?;
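One way to batch the reads suggested here is to group card ids into chunks and issue one IN-clause query per chunk; the SQL shape below is purely illustrative and not Anki's actual storage layer:

```rust
// Sketch of batched card retrieval: instead of one get_card call per id,
// build a single `IN (...)` query per chunk of ids. The table and column
// names are assumptions for illustration.
fn in_clause_queries(ids: &[u64], chunk_size: usize) -> Vec<String> {
    ids.chunks(chunk_size)
        .map(|chunk| {
            let placeholders = vec!["?"; chunk.len()].join(",");
            format!("SELECT * FROM cards WHERE id IN ({placeholders})")
        })
        .collect()
}
```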

PS: I am not a developer.

user1823 avatar Sep 14 '25 12:09 user1823

@user1823 True, I got the impression from the Discord that the FSRS code is the "known criminal" here, but there's no benchmark, so the function could be slow for any reason, including IO. Obviously batching IO is a good idea too. Btw, 10,000,000 cards = ~~351 MB~~3.51 GB. ~~Not much~~Substantial, but the workaround is simple: batch cards while processing them. I find it unlikely that memory usage will be a problem, though.

DanielPechersky avatar Sep 14 '25 12:09 DanielPechersky

My change will allocate an extra 3.51 MB of scratch space for every 10,000 cards the function processes.

Btw 10,000,000 cards = 351 MB

If 10k cards is 3.51 MB, then 1000 more than that should be 3.51 GB, no?
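The arithmetic behind the correction: 3.51 MB per 10,000 cards is about 351 bytes of scratch space per card, so 10,000,000 cards need 1000× as much, i.e. ~3.51 GB rather than 351 MB:

```rust
// Scaling the quoted figure: 3.51 MB per 10k cards = 351 bytes/card.
// The per-card figure is derived from the numbers in the thread, not
// measured independently.
fn scratch_bytes(cards: u64) -> u64 {
    let bytes_per_card = 3_510_000 / 10_000; // 351 bytes
    cards * bytes_per_card
}
```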

Expertium avatar Sep 14 '25 12:09 Expertium

@Expertium Truly, my zero counting skills are legendary. You're right

DanielPechersky avatar Sep 14 '25 13:09 DanielPechersky

Can someone check whether the slowest part is FSRS and not fetching the card's review history from wherever Anki stores it?

Expertium avatar Sep 14 '25 19:09 Expertium

Commit 27f3ecf8651ff63746b2d8cc0b0e4caf2672d491 is a basic implementation that uses the new FSRS function memory_state_batch. Commit 74665ef2a8548bbdb5c8386d9f2569ff65de5239 expands on it and batches items (to prevent using too much memory) as well as batching the inputs to FSRS for potential efficiency.

DanielPechersky avatar Sep 15 '25 18:09 DanielPechersky

When I was saving the preset, this error occurred:

Anki 25.09 (c5036e20) (ao)
Python 3.11.6 Qt 6.9.1 PyQt 6.9.1
Platform: macOS-14.5-arm64-arm-64bit

Traceback (most recent call last):
  File "/Users/jarrettye/Codes/anki/qt/aqt/progress.py", line 198, in on_progress
    self.update(label=update.label, value=update.value, max=update.max)
  File "/Users/jarrettye/Codes/anki/qt/aqt/progress.py", line 228, in update
    self._win.form.progressBar.setValue(self._counter)
OverflowError: argument 1 overflowed: value must be in the range -2147483648 to 2147483647
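The traceback shows a progress counter exceeding Qt's signed 32-bit range. One defensive pattern on the producing side is to saturate when converting down to i32; this is a sketch of the idea, not the fix that was actually applied:

```rust
// Saturate an unbounded counter into Qt's i32 progress range instead of
// letting the conversion overflow. Hypothetical helper for illustration.
fn progress_value(counter: u64) -> i32 {
    i32::try_from(counter).unwrap_or(i32::MAX)
}
```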

L-M-Sherlock avatar Sep 16 '25 02:09 L-M-Sherlock

It's nice to combine the progress. Here is my modification:

--- a/rslib/src/scheduler/fsrs/memory_state.rs
+++ b/rslib/src/scheduler/fsrs/memory_state.rs
@@ -17,2 +17,3 @@ use crate::card::CardType;
 use crate::prelude::*;
+use crate::progress::ThrottlingProgressHandler;
 use crate::revlog::RevlogEntry;
@@ -67,3 +68,3 @@ impl<T> ChunkIntoVecs<T> for Vec<T> {
         std::iter::from_fn(move || {
-            (!self.is_empty()).then(|| self.split_off(chunk_size.min(self.len())))
+            (!self.is_empty()).then(|| self.drain(..chunk_size.min(self.len())).collect())
         })
@@ -118,2 +119,5 @@ impl Collection {
 
+            let mut progress = self.new_progress_handler::<ComputeMemoryProgress>();
+            progress.update(false, |s| s.total_cards = items.len() as u32)?;
+
             let (items, cards_without_items): (Vec<(CardId, FsrsItemForMemoryState)>, Vec<CardId>) =
@@ -148,2 +152,3 @@ impl Collection {
                 usn,
+                &mut progress,
             )?;
@@ -230,2 +235,3 @@ impl Collection {
                 usn,
+                &mut progress,
             )?;
@@ -257,5 +263,4 @@ impl Collection {
         usn: Usn,
+        progress: &mut ThrottlingProgressHandler<ComputeMemoryProgress>,
     ) -> Result<()> {
-        let mut progress = self.new_progress_handler::<ComputeMemoryProgress>();
-        progress.update(false, |s| s.total_cards = cards.len() as u32)?;
         for (idx, card_id) in cards.into_iter().enumerate() {
@@ -278,2 +283,3 @@ impl Collection {
         usn: Usn,
+        progress: &mut ThrottlingProgressHandler<ComputeMemoryProgress>,
     ) -> Result<()> {
@@ -285,5 +291,2 @@ impl Collection {
         let mut starting_states = Vec::new();
-
-        let mut progress = self.new_progress_handler::<ComputeMemoryProgress>();
-        progress.update(false, |s| s.total_cards = items.len() as u32)?;
         for (i, items) in items.chunk_into_vecs(ITEM_CHUNK_SIZE).enumerate() {
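The `drain` change in the diff above matters for correctness, not just style: `split_off(n)` returns the *tail* of the vector (leaving the first `n` elements behind), while `drain(..n)` removes and yields the *front* `n` elements in order. A standalone sketch of the fixed chunking:

```rust
// Standalone version of the drain-based chunking from the diff:
// drain(..n) yields the front n elements as one chunk per iteration,
// consuming the vector front-to-back in order.
fn chunk_into_vecs<T>(mut v: Vec<T>, chunk_size: usize) -> impl Iterator<Item = Vec<T>> {
    std::iter::from_fn(move || {
        (!v.is_empty()).then(|| v.drain(..chunk_size.min(v.len())).collect())
    })
}
```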

L-M-Sherlock avatar Sep 16 '25 04:09 L-M-Sherlock

Looks good, feel free to commit that

DanielPechersky avatar Sep 16 '25 05:09 DanielPechersky

I'm not a maintainer of this repo, so I cannot commit it to this PR.

L-M-Sherlock avatar Sep 16 '25 06:09 L-M-Sherlock

Here is the samply record: https://share.firefox.dev/41TslxP

L-M-Sherlock avatar Sep 16 '25 10:09 L-M-Sherlock

Do not merge: waiting on fsrs-rs 5.2.0

DanielPechersky avatar Sep 18 '25 14:09 DanielPechersky

Do not merge: waiting on fsrs-rs 5.2.0

It was published yesterday: https://crates.io/crates/fsrs/5.2.0

L-M-Sherlock avatar Sep 18 '25 15:09 L-M-Sherlock

Ready to merge, but no snapshot test and I have not manually tested it. Not sure if I'll have time to implement the snapshot test.

DanielPechersky avatar Sep 18 '25 16:09 DanielPechersky

Ready to merge, but no snapshot test and I have not manually tested it. Not sure if I'll have time to implement the snapshot test.

A basic end-to-end test is checking the stability distribution stats. If the distribution is consistent, the implementation should be correct.
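One way to realize this distribution check is to compare a summary statistic of the stabilities produced by the old and new code within a small tolerance; the shape of the comparison and the tolerance are assumptions, not the test that was run:

```rust
// Hypothetical consistency check: the old and new implementations should
// produce stability distributions with (near-)identical summary stats.
// Comparing means is the simplest such check; quantiles would be stricter.
fn distributions_match(old: &[f64], new: &[f64], tol: f64) -> bool {
    if old.len() != new.len() || old.is_empty() {
        return false;
    }
    let mean = |xs: &[f64]| xs.iter().sum::<f64>() / xs.len() as f64;
    (mean(old) - mean(new)).abs() <= tol
}
```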

L-M-Sherlock avatar Sep 19 '25 02:09 L-M-Sherlock

Oh, sorry... bitand is too fancy. && is enough.

            let mut rescheduler = (req.reschedule
                && self.get_config_bool(BoolKey::LoadBalancerEnabled))
            .then(|| Rescheduler::new(self))
            .transpose()?;

L-M-Sherlock avatar Sep 19 '25 08:09 L-M-Sherlock

Either someone manually tests this or we merge https://github.com/ankitects/anki/pull/4349 before this gets merged

DanielPechersky avatar Sep 19 '25 14:09 DanielPechersky

@dae please take a look at this PR as well

Expertium avatar Sep 27 '25 06:09 Expertium

Working towards it.

dae avatar Sep 27 '25 09:09 dae

Apologies for the delay, and thank you to everyone who's contributed.

assuming card.set_memory_state is much slower than everything else.

So, we might get a speedup by retrieving and updating cards in batches.

I recall testing this when I first moved Anki's DB access into Rust, and IIRC, there was a negligible benefit to batching reads/writes. Batching works well in Python because it reduces the overhead of C/Python context switching, but I'd be a bit surprised if batching sqlite calls makes much of a difference.

When you first proposed these changes, I would have liked to see a benchmark (manually run is fine) that quantified the before and after results, so we're not just guessing. From the commit history, it looks like Sherlock has since dug into it on the FSRS end and you've reverted the other batching changes, so I just mention this for the future. It would still be nice for the PR description to describe the current state of things and what sort of improvement this has made.

A basic end-to-end test is checking the stability distribution stats. If the distribution is consistent, the implementation should be correct.

I'd rather we don't commit large test files into the repo, but we should at least be doing a one-time test to confirm the new code's output is functionally equivalent to the old code. Once the description describes the current state of things and you have tested the code is correct, assuming @L-M-Sherlock is happy with things, I will be happy to do a quick review of the code and merge this in.

dae avatar Oct 01 '25 10:10 dae

I have tested the consistency in my collection, and feel happy with it!

L-M-Sherlock avatar Oct 01 '25 12:10 L-M-Sherlock

@dae I have updated the description

DanielPechersky avatar Oct 01 '25 16:10 DanielPechersky

Fixed the merge conflicts, please review

DanielPechersky avatar Oct 01 '25 16:10 DanielPechersky

You have accidentally pulled other people's commits from the main branch into this PR.

user1823 avatar Oct 01 '25 16:10 user1823

I merged main in to fix merge conflicts. You can try https://stackoverflow.com/a/26986320/6211863 to see the diff, or I can rebase and remove the latest commit so someone else can fix the merge conflicts on merging to main.

DanielPechersky avatar Oct 02 '25 00:10 DanielPechersky

It would still be nice for the PR description to describe the current state of things, and what sort of improvement this has made.

Still would appreciate some actual numbers on the performance difference. :-)

dae avatar Oct 02 '25 06:10 dae

Still would appreciate some actual numbers on the performance difference. :-)

I copied these figures from discord:

25.09 took 10.182176084s

Sahl's PR took 969.745458ms

Sahl's PR plus my PR took 853.057208ms

The test collection has ~46k cards with ~760k reviews.
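Expressed as speedup factors, the quoted timings work out to roughly 10.5× for Sahl's PR alone and about 11.9× with both PRs combined:

```rust
// Speedup factors derived purely from the timings quoted above.
fn speedup(before_secs: f64, after_secs: f64) -> f64 {
    before_secs / after_secs
}
```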

L-M-Sherlock avatar Oct 02 '25 07:10 L-M-Sherlock

That's an impressive improvement - well done to you both.

I've looked through the code briefly and it looks well written. My only remaining issue is with the smoke test - I don't think it's very useful at the moment, as you're running it on an empty collection. Adding at least one card and checking that one or more properties have been updated post-run as expected would be good. deckconfig/update.rs:updating() might be useful as a reference.

dae avatar Oct 02 '25 09:10 dae

Extremely minor, and it looks like it was present before your PR: [].as_slice() is more idiomatically written as &[]

dae avatar Oct 02 '25 09:10 dae