Speed up update_memory_state by batching FSRS memory state calculation
Break down update_memory_state into three cases:
- Clear FSRS data for cards
- Update memory state for cards without items
- Update memory state for cards with items
Use fsrs.memory_state_batch over card.set_memory_state when handling case 3 for performance gains.
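The three-way split described above could be sketched as a single partition pass. This is illustrative only: `Card`, `partition`, and the fields below are simplified stand-ins, not the real Anki types.

```rust
// Simplified stand-ins for the real Anki types (illustrative only).
struct Card {
    id: i64,
    fsrs_enabled: bool,     // case 1 when false: clear FSRS data
    item: Option<Vec<u32>>, // review history; None => case 2
}

// Split cards into the three cases handled by update_memory_state.
fn partition(cards: Vec<Card>) -> (Vec<i64>, Vec<i64>, Vec<(i64, Vec<u32>)>) {
    let (mut clear, mut without_items, mut with_items) = (Vec::new(), Vec::new(), Vec::new());
    for card in cards {
        if !card.fsrs_enabled {
            clear.push(card.id); // case 1: clear FSRS data
        } else if let Some(item) = card.item {
            with_items.push((card.id, item)); // case 3: batched through FSRS
        } else {
            without_items.push(card.id); // case 2: no review items
        }
    }
    (clear, without_items, with_items)
}

fn main() {
    let cards = vec![
        Card { id: 1, fsrs_enabled: false, item: None },
        Card { id: 2, fsrs_enabled: true, item: None },
        Card { id: 3, fsrs_enabled: true, item: Some(vec![3, 4]) },
    ];
    let (clear, without, with) = partition(cards);
    assert_eq!(clear, vec![1]);
    assert_eq!(without, vec![2]);
    assert_eq!(with.len(), 1);
}
```

Case 3 is where the batching pays off: all items in a chunk can be handed to FSRS in one call instead of one call per card.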
TODO: I don't update progress correctly, assuming `card.set_memory_state` is much slower than everything else.
Just a guess, but I think retrieving and updating each card individually might be the slowest steps. So, we might get a speedup by retrieving and updating cards in batches. Trying to process all cards at once may cause out of memory issues though, if the user has millions of cards.
let mut card = self.storage.get_card(card_id)?.or_not_found(card_id)?;
self.update_card_inner(&mut card, original, usn)?;
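A batched version of that loop could look roughly like this sketch; `get_cards_batch` and `update_cards_batch` are hypothetical helpers standing in for real storage calls, not existing Anki APIs.

```rust
// Sketch only: process card ids in fixed-size chunks so peak memory
// stays bounded even for collections with millions of cards.
const CARD_CHUNK_SIZE: usize = 10_000;

fn process_in_chunks(card_ids: &[i64]) -> usize {
    let mut processed = 0;
    for chunk in card_ids.chunks(CARD_CHUNK_SIZE) {
        // let mut cards = storage.get_cards_batch(chunk)?;  // one read per chunk (hypothetical)
        // ...compute memory states for the whole chunk...
        // storage.update_cards_batch(&cards)?;              // one write per chunk (hypothetical)
        processed += chunk.len();
    }
    processed
}

fn main() {
    let ids: Vec<i64> = (0..25_000).collect();
    // 25,000 ids => chunks of 10,000 / 10,000 / 5,000
    assert_eq!(process_in_chunks(&ids), 25_000);
}
```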
PS: I am not a developer.
@user1823 True, I got the impression from the Discord that the FSRS code is the "known criminal" here, but there's no benchmark, so the function could be slow for any reason, including IO. Obviously batching IO is a good idea too. Btw, 10,000,000 cards = ~~351 MB~~ 3 GB. ~~Not much~~ Substantial, but the workaround is simple: batch cards while processing them. I find it unlikely that the memory usage will be a problem, though.
My change will allocate an extra 3.51 MB of scratch space for every 10,000 cards the function processes.
> Btw 10,000,000 cards = 351 MB
If 10k cards is 3.51 MB, then 1000 times that should be 3.51 GB, no?
@Expertium Truly, my zero counting skills are legendary. You're right
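Spelling out the arithmetic behind the correction: 10,000,000 cards is 1000 chunks of 10,000 cards, so the per-chunk figure scales by 1000.

```rust
fn main() {
    let mb_per_10k_cards = 3.51_f64;
    let cards = 10_000_000_f64;
    let total_mb = mb_per_10k_cards * (cards / 10_000.0); // 1000 chunks of 10k
    assert!((total_mb - 3_510.0).abs() < 1e-6);
    println!("{} MB = ~{} GB", total_mb, total_mb / 1000.0); // ~3.51 GB, not 351 MB
}
```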
Can someone check whether the slowest part is FSRS and not fetching the card's review history from wherever Anki stores it?
Commit 27f3ecf8651ff63746b2d8cc0b0e4caf2672d491 is a basic implementation that uses the new FSRS function memory_state_batch.
Commit 74665ef2a8548bbdb5c8386d9f2569ff65de5239 expands on it and batches items (to prevent using too much memory) as well as batching the inputs to FSRS for potential efficiency.
When I was saving the preset, this error occurred:
Anki 25.09 (c5036e20) (ao)
Python 3.11.6 Qt 6.9.1 PyQt 6.9.1
Platform: macOS-14.5-arm64-arm-64bit
Traceback (most recent call last):
  File "/Users/jarrettye/Codes/anki/qt/aqt/progress.py", line 198, in on_progress
    self.update(label=update.label, value=update.value, max=update.max)
  File "/Users/jarrettye/Codes/anki/qt/aqt/progress.py", line 228, in update
    self._win.form.progressBar.setValue(self._counter)
OverflowError: argument 1 overflowed: value must be in the range -2147483648 to 2147483647
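The error arises because `QProgressBar.setValue` takes a C `int`, so whatever counter is passed in must fit in 32 bits. A minimal sketch of a clamping guard (written in Rust for consistency with the rest of the thread; the actual fix would live on the Python/Qt side):

```rust
// Clamp a wide counter into the i32 range accepted by a 32-bit progress API.
fn clamp_to_i32(counter: i64) -> i32 {
    counter.clamp(i32::MIN as i64, i32::MAX as i64) as i32
}

fn main() {
    assert_eq!(clamp_to_i32(3_000_000_000), i32::MAX); // would otherwise overflow
    assert_eq!(clamp_to_i32(42), 42);                  // in-range values pass through
}
```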
It would be nice to combine the progress reporting into a single handler. Here is my modification:
--- a/rslib/src/scheduler/fsrs/memory_state.rs
+++ b/rslib/src/scheduler/fsrs/memory_state.rs
@@ -17,2 +17,3 @@ use crate::card::CardType;
use crate::prelude::*;
+use crate::progress::ThrottlingProgressHandler;
use crate::revlog::RevlogEntry;
@@ -67,3 +68,3 @@ impl<T> ChunkIntoVecs<T> for Vec<T> {
std::iter::from_fn(move || {
- (!self.is_empty()).then(|| self.split_off(chunk_size.min(self.len())))
+ (!self.is_empty()).then(|| self.drain(..chunk_size.min(self.len())).collect())
})
@@ -118,2 +119,5 @@ impl Collection {
+ let mut progress = self.new_progress_handler::<ComputeMemoryProgress>();
+ progress.update(false, |s| s.total_cards = items.len() as u32)?;
+
let (items, cards_without_items): (Vec<(CardId, FsrsItemForMemoryState)>, Vec<CardId>) =
@@ -148,2 +152,3 @@ impl Collection {
usn,
+ &mut progress,
)?;
@@ -230,2 +235,3 @@ impl Collection {
usn,
+ &mut progress,
)?;
@@ -257,5 +263,4 @@ impl Collection {
usn: Usn,
+ progress: &mut ThrottlingProgressHandler<ComputeMemoryProgress>,
) -> Result<()> {
- let mut progress = self.new_progress_handler::<ComputeMemoryProgress>();
- progress.update(false, |s| s.total_cards = cards.len() as u32)?;
for (idx, card_id) in cards.into_iter().enumerate() {
@@ -278,2 +283,3 @@ impl Collection {
usn: Usn,
+ progress: &mut ThrottlingProgressHandler<ComputeMemoryProgress>,
) -> Result<()> {
@@ -285,5 +291,2 @@ impl Collection {
let mut starting_states = Vec::new();
-
- let mut progress = self.new_progress_handler::<ComputeMemoryProgress>();
- progress.update(false, |s| s.total_cards = items.len() as u32)?;
for (i, items) in items.chunk_into_vecs(ITEM_CHUNK_SIZE).enumerate() {
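The `drain` change in the hunk above matters: `split_off(chunk_size.min(len))` returns the *tail* of the vector, not the first chunk, so the old form yielded the wrong elements. A standalone sketch of the corrected iterator (the trait reimplemented as a free function for illustration, assuming `chunk_size > 0`):

```rust
// Repeatedly drain the first `chunk_size` elements until the Vec is empty,
// yielding chunks front-to-back. Assumes chunk_size > 0.
fn chunk_into_vecs<T>(mut v: Vec<T>, chunk_size: usize) -> impl Iterator<Item = Vec<T>> {
    std::iter::from_fn(move || {
        (!v.is_empty()).then(|| v.drain(..chunk_size.min(v.len())).collect())
    })
}

fn main() {
    let chunks: Vec<Vec<u32>> = chunk_into_vecs(vec![1, 2, 3, 4, 5], 2).collect();
    assert_eq!(chunks, vec![vec![1, 2], vec![3, 4], vec![5]]);
    // the split_off form would instead have returned the tail on each call
}
```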
Looks good, feel free to commit that
I'm not a maintainer of this repo, so I cannot commit it to this PR.
Here is the samply record: https://share.firefox.dev/41TslxP
Do not merge: waiting on fsrs-rs 5.2.0
> Do not merge: waiting on fsrs-rs 5.2.0
It was published yesterday: https://crates.io/crates/fsrs/5.2.0
Ready to merge, but no snapshot test and I have not manually tested it. Not sure if I'll have time to implement the snapshot test.
> Ready to merge, but no snapshot test and I have not manually tested it. Not sure if I'll have time to implement the snapshot test.
A basic end-to-end test is checking the stability distribution stats. If the distribution is consistent, the implementation should be correct.
Oh, sorry... `bitand` is too fancy. `&&` is enough.
let mut rescheduler = (req.reschedule
&& self.get_config_bool(BoolKey::LoadBalancerEnabled))
.then(|| Rescheduler::new(self))
.transpose()?;
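The pattern in that snippet is worth spelling out: `bool::then` gates a fallible constructor, producing an `Option<Result<T, E>>`, and `transpose()` flips it into a `Result<Option<T>, E>` so that `?` can propagate errors. A self-contained sketch (the parse call here is just a stand-in for a fallible constructor like `Rescheduler::new`):

```rust
// Gate a fallible constructor on a bool, then flip
// Option<Result<T, E>> into Result<Option<T>, E> with transpose().
fn maybe_build(enabled: bool) -> Result<Option<u32>, String> {
    let value = enabled
        .then(|| "42".parse::<u32>().map_err(|e| e.to_string()))
        .transpose()?; // Err propagates; Some/None pass through
    Ok(value)
}

fn main() {
    assert_eq!(maybe_build(true), Ok(Some(42)));
    assert_eq!(maybe_build(false), Ok(None));
}
```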
Either someone manually tests this or we merge https://github.com/ankitects/anki/pull/4349 before this gets merged
@dae please take a look at this PR as well
Working towards it.
Apologies for the delay, and thank you to everyone who's contributed.
> assuming card.set_memory_state is much slower than everything else.
> So, we might get a speedup by retrieving and updating cards in batches.
I recall testing this when I first moved Anki's DB access into Rust, and IIRC, there was a negligible benefit to batching reads/writes. Batching works well in Python because it reduces the overhead of C/Python context switching, but I'd be a bit surprised if batching sqlite calls makes much of a difference.
When you first proposed these changes, I would have liked to see a benchmark (manually run is fine) that quantified the before and after results, so we're not just guessing. From the commit history, it looks like Sherlock has since dug into it on the FSRS end and you've reverted the other batching changes, so I just mention this for the future. It would still be nice for the PR description to describe the current state of things, and what sort of improvement this has made.
> A basic end-to-end test is checking the stability distribution stats. If the distribution is consistent, the implementation should be correct.
I'd rather we don't commit large test files into the repo, but we should at least be doing a one-time test to confirm the new code's output is functionally equivalent to the old code. Once the description describes the current state of things and you have tested the code is correct, assuming @L-M-Sherlock is happy with things, I will be happy to do a quick review of the code and merge this in.
I have tested the consistency in my collection and am happy with the results!
@dae I have updated the description
Fixed the merge conflicts, please review
You have accidentally pulled other people's commits from the main branch into this PR.
I merged main in to fix merge conflicts. You can try https://stackoverflow.com/a/26986320/6211863 to see the diff, or I can rebase and remove the latest commit so someone else can fix the merge conflicts on merging to main.
> It would still be nice for the PR description to describe the current state of things, and what sort of improvement this has made.
Still would appreciate some actual numbers on the performance difference. :-)
> Still would appreciate some actual numbers on the performance difference. :-)
I copied these figures from discord:
- 25.09 took 10.182176084s
- Sahl's PR took 969.745458ms
- Sahl's PR plus my PR took 853.057208ms
The test collection has ~46k cards with ~760k reviews.
That's an impressive improvement - well done to you both.
I've looked through the code briefly and it looks well written. My only remaining issue is with the smoke test - I don't think it's very useful at the moment, as you're running it on an empty collection. Adding at least one card and checking that one or more properties have been updated post-run as expected would be good. deckconfig/update.rs:updating() might be useful as a reference.
Extremely minor, and it looks like it was present prior to your PR: `[].as_slice()` is more idiomatically `&[]`.
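For reference, both spellings produce an equal empty slice; `&[]` just avoids the intermediate array-to-slice call (the array is named here so the borrow outlives the statement):

```rust
fn main() {
    let arr: [i32; 0] = [];
    let a: &[i32] = arr.as_slice(); // what the code currently does
    let b: &[i32] = &[];            // the more idiomatic spelling
    assert_eq!(a, b);
    assert!(b.is_empty());
}
```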