Make FSRS the default?
In the next non-trivial (not 24.11.x) update, I think it's about time we enable FSRS out of the box. Any objections?
> Any objections?
Yes. Let's not make FSRS the default before automatic optimization. Realistically, how many users do you expect to click "Optimize" at least once in their lifetime? I'd say 50% at best, likely less. And how many users will click "Optimize" multiple times? 10%? 5%?
Right now it's mostly power users and tech-savvy people that are using FSRS, so they know that optimization should be done regularly. An average user who is using Anki with out of the box settings won't realize that optimization has to be done at all. For a power user, automatic optimization saves 2 seconds of clicking "Optimize". For an average user, it makes the difference between using the default parameters and the personalized parameters.
My point: FSRS with default parameters is better than SM-2 in 91.9% of cases.
source: https://github.com/open-spaced-repetition/srs-benchmark/blob/main/plots/Superiority-9999.png
A CONFLICT OF INTEREST: I'm the main developer of FSRS, please disregard my opinion.
And FSRS-5 with optimized parameters is better in 99.0% of cases. In % it may not seem like a big difference (91.9% vs 99.0%), but in terms of odds, it's an improvement from "1 user out of 12 would be better off using SM-2 than FSRS" to "1 user out of 100 would be better off using SM-2 than FSRS".
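The odds framing is simply the reciprocal of the share of users for whom SM-2 comes out ahead. A minimal sketch of that arithmetic, using the benchmark percentages quoted in this thread (the helper name `one_in_n` is made up for illustration):

```python
def one_in_n(fsrs_better_pct: float) -> float:
    """Return N such that roughly 1 user in N is better off with SM-2."""
    sm2_better_pct = 100.0 - fsrs_better_pct
    return 100.0 / sm2_better_pct


# Percentages are the ones cited in the thread, not recomputed here.
default_params = one_in_n(91.9)    # a little over 12, i.e. roughly "1 in 12"
optimized_params = one_in_n(99.0)  # 100, i.e. "1 in 100"
```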
> And FSRS-5 with optimized parameters is better in 99.0% of cases.
Well, no one here is saying that automatic optimization won't be implemented.
But, until we can develop AO, I think that it's reasonable to provide users with something that is better than what they currently have, even if it is not the best.
Also, let's stop discussing AO now, because arguments from both sides have been made and it's dae who has to make the decision.
If you (or someone else) have any other objection, please feel free to discuss.
FSRS-5 with AO is better than FSRS-5 with default parameters in 80.5% of cases, which is less than 91.9%.
So the improvement from AO is smaller than the improvement from switching to FSRS-5.
@Expertium you have contributed a large amount of time and effort both into suggesting improvements to FSRS, and advising users on its correct usage, including very comprehensive posts like https://old.reddit.com/r/Anki/comments/1h2otym/anki_2411_one_of_the_biggest_updates_ever/. They have been noticed, and I really appreciate all the work you have put in.
That said, this is the second time you've attempted to delay an improvement until it's perfect. I don't think that's the best approach - I think it's better that we get these improvements into the hands of the bulk of users, and address any issues in the future.
Alright. Maybe make the mythical optimization reminder (I have never seen it myself) more frequent? I've heard that it only shows up for people who have used "Optimize all presets", though. That would have to change.
I agree that FSRS should be set to default, a milestone change.
@dae how about a really radical solution - Optimize [all presets] right next to Sync? That way it's impossible to miss.
This + a pop-up that appears each time the number of reviews in the collection doubles (starting from, say, 100 reviews), which is a better rule than "every month". So the user will get a pop-up at 100 reviews, then at 200, then at 400, etc.
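A rough sketch of the doubling rule described above, assuming the app can store how many optimization prompts it has already shown. The function names and the `prompts_shown` counter are hypothetical and not part of Anki:

```python
from typing import List


def milestones_reached(total_reviews: int, first: int = 100) -> List[int]:
    """List the doubling milestones (100, 200, 400, ...) at or below total_reviews."""
    reached = []
    milestone = first
    while milestone <= total_reviews:
        reached.append(milestone)
        milestone *= 2
    return reached


def should_prompt(total_reviews: int, prompts_shown: int, first: int = 100) -> bool:
    """Prompt when more milestones have been passed than prompts were shown."""
    return len(milestones_reached(total_reviews, first)) > prompts_shown
```

Counting milestones rather than remembering the review count at the last prompt keeps the schedule at 100, 200, 400, and so on, even if the user jumps past a milestone between checks.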
I'm neutral on this change, but I wanted to ask what "by default" means. For new users/profiles only? Or would it take effect for everyone as soon as we download the new version (whatever number it ends up being), unless I deliberately change it back to SM-2 + custom scheduling?
(I'm also wondering about AnkiMobile, since that one often updates in the background without me noticing. If it's not just for new users/profiles and would go out to everyone, would I get an alert that the scheduler has changed?)
No forced transition. If there is a previous installation of Anki on your device, the settings of that installation will be kept. If no previous installation is found, i.e. this is the first time you are installing Anki, FSRS will be enabled by default. The option to enable SM-2 will remain for backward compatibility; it's not like SM-2 will be completely deleted.
That's how I imagine it, and I'm betting 100 bucks that's how it will be.
I would like to bring @dae's attention to these issues I feel need to get fixed.
1. https://github.com/open-spaced-repetition/fsrs4anki/issues/675#issue-2445611951: This is the issue where using SDD to set the due date for new cards doesn't affect the scheduling after that. Already a few people have reported this affecting their workflow so I think it needs to get fixed, preferably with a revert of db93939ded947d74bb25bd8552a2b2356a096509 as described here.
2. https://github.com/open-spaced-repetition/fsrs4anki/issues/708#issuecomment-2497612865: This one's new and only started happening after the FSRS-5 update. In brief, the interval that I get after relearning ends often becomes higher than what it was before the card entered the learn state. So the card ends up in a vicious cycle of fail and relearn.
I wouldn't expect 2) to be solved any time soon, tbh. And frankly, I'm much more interested in whether Dae will like this idea.
Wait, I completely forgot about the Hard problem.
https://forums.ankiweb.net/t/how-to-prevent-users-from-misusing-hard-ideas-are-welcome/49092/133?u=expertium
@dae nevermind, I am unironically, chronically, radically, medically, concretely, discreetly opposed to making FSRS the default algorithm until the UI is changed to make it clear that Hard is not "fail".
Remember, at least 10% of Anki users misuse Hard, and this is based on nerds from r/Anki. It's likely worse outside of r/Anki, since r/Anki is a place specifically for Anki enthusiasts.
Dae, here are some options:
- Do nothing. Cons: Users will continue misusing Hard and then complain that FSRS is terrible.
- Implement what I and others have suggested. Cons: people who like symmetry will hate it.
- Make 2 buttons the default, and leave an option to enable Hard and Easy in Preferences. Cons: it will make old documentation and old videos about Anki confusing.
- Make an interactive tutorial that helps new users, like this. Cons: it will require a loooooot of effort.
EDIT: Dae, please tell me that this issue is "I am considering making FSRS the default, but willing to postpone it if there are serious roadblocks" and NOT "I have already made up my mind, but feel free to shout into the void to get a false sense of involvement". Because right now I'm getting the vibes of the latter. Feel free to call me a dunderhead and tell me to be more respectful, but I'm not the only one who feels this way.
Automatic optimization (without option to disable it and manually tweak the weights) is a very bad idea for many reasons. FSRS will never be perfect to the point where you can rely on it without a doubt.
The biggest problem with FSRS is overfitting.
I will try to explain what I mean:
Let's say the user only uses the Again and Good buttons, and his usual pattern of answering (with optimized weights, let's call them w_n) is something like:
1st day (New): 1, 3, 3, 3
3rd day: 3
8th day: 3
...
This user answers most of his new cards in this pattern for quite a long time. The problem arises when the user decides to re-optimize the weights to the w_(n+1) state. FSRS, analyzing his reviews, sees that if the first answer is 1, the user will still answer 3 on the 3rd and 8th days, so FSRS will increase w_0, the initial stability for Again, in such a way that the user can immediately skip to the 7th day, for example.
But the point is that there is no evidence that this will result in the same "recall quality/memorization of this piece of information" as in the "1st, 3rd, 8th day" case.
To summarize what I have written above: if there is a repeating good pattern of answers, FSRS will optimize it to use fewer steps (this is the main principle of any optimization model). But it is very likely that this good pattern existed precisely because of the extra steps.
I have exactly this situation, and because of it I have to manually tweak weights (mainly w_0) to get more realistic intervals.
With AO, such manipulations will not be possible.
Manual tweaking and the ability to leave the weights in one state until the user wants to optimize them should stay in Anki (as well as manual "learning/relearning steps"). For me, the right decision seems to be to add a toggle in the preset menu for AO on a per-preset basis. Or at least, as Expertium suggested, an "Optimize [all presets]" button right next to Sync. These two solutions will satisfy all types of users: those who want to fully rely on FSRS and those who want some thoughtful control.
Regarding "Make FSRS default", I fully agree.
> FSRS will increase w_0, the initial stability for Again, in such a way that the user can immediately skip to the 7th day, for example.
> To summarize what I have written above: if there is a repeating good pattern of answers, FSRS will optimize it to use fewer steps (this is the main principle of any optimization model). But it is very likely that this good pattern existed precisely because of the extra steps.
That's not how FSRS works. @L-M-Sherlock feel free to clear the misunderstanding.
> > FSRS will increase w_0, the initial stability for Again, in such a way that the user can immediately skip to the 7th day, for example.
> > To summarize what I have written above: if there is a repeating good pattern of answers, FSRS will optimize it to use fewer steps (this is the main principle of any optimization model). But it is very likely that this good pattern existed precisely because of the extra steps.
>
> That's not how FSRS works. @L-M-Sherlock feel free to clear the misunderstanding.
If you are so sure, please explain why my w_0 always increases after re-optimizing the weights? Even if I am not entirely correct in the FSRS algorithm, the result is the same - FSRS is overfitting, which results in always increasing w_0, hence increasing intervals.
Not a contributor, but I'd like to propose to the team an option to choose one of two models after first launch, with a simple description and a clarification that the model can be changed in settings later, instead of just changing a years-long default that has proven to attract a lot of users. It would also make users more aware of the different models and the importance of some settings. I guess far more users would try both models and different settings after the initial launch if they saw that banner.
"Before starting, we'd like to ask: how do you prefer to learn?
- Experimental way, possibly spending less time.
- More stable and predictable way, sometimes having more reviews per day. (The learning model can be changed in settings later.)"
> Not a contributor, but I'd like to propose to the team an option to choose one of two models after first launch, with a simple description and a clarification that the model can be changed in settings later, instead of just changing a years-long default that has proven to attract a lot of users. It would also make users more aware of the different models and the importance of some settings. I guess far more users would try both models and different settings after the initial launch if they saw that banner.
> "Before starting, we'd like to ask: how do you prefer to learn?
> - Experimental way, possibly spending less time.
> - More stable and predictable way, sometimes having more reviews per day. (The learning model can be changed in settings later.)"
Unfortunately, the Anki team chose the path of "Let's simplify everything by taking away the user's right to customize the app/their workflow and remove 'unnecessary' options" some time ago (if you want, you can read the PR for Load Balancer - https://github.com/ankitects/anki/pull/3230/). So from now on there will never be a banner that lets the user decide something for himself because it is "too difficult for the average user".
Set FSRS as the default, with 4 caveats (✨)
Caveats (TL;DR)
- We should aim for automatic optimization in future.
- ~~Delve into the comparison between FSRS-5-unoptimized and SM-2 to be sure we're not picking up pennies in front of a steamroller. The users in the 8.1% may have a scheduler with catastrophically worse probability of recall, and FSRS may be marginally better at prediction (unlikely, but let's use the data).~~
- A new user to Anki under FSRS needs to be warned if misuse of Hard is detected. I'd be happy with something as simple as repurposing the old 'too many decks can slow down your collection' warning on the main screen if hard misuse is detected/suspected.
- (nice to have) - simplify FSRS settings. Most of the settings pane can be hidden behind either an 'advanced' option, or prompted when FSRS is enabled for the first time, then hidden.
- We should not provide a choice of algorithm during onboarding:
- FSRS is a sensible default, and it should be a very advanced option to move to a (likely) worse scheduler. There is huge cognitive load in selecting an algorithm, and the onboarding user experience would be awful.
- We should not force a transition to FSRS.
Automatic Optimization
Result: Obvious win for FSRS (better results for 91.9% of users).
✨ Caveat 1: We should aim for automatic optimization in future.
✨ Caveat 2: Delve into the comparison between FSRS-5-unoptimized and SM-2 to be sure we're not picking up pennies in front of a steamroller. The users in the 8.1% may have a scheduler with catastrophically worse probability of recall, and FSRS may be marginally better at prediction (unlikely, but let's use the data).
Treating FSRS without optimization as a separate scheduler (FSRS-5-unoptimized), the question is: "Do we make FSRS-5-unoptimized the default for new users?"
Table: % of collections in the benchmark where Algorithm A (row) estimates the probability of recall more accurately than Algorithm B (column). source
| | FSRS-5 | FSRS-5-unoptimized | SM-2 |
|---|---|---|---|
| FSRS-5 | - | 80.5% | 99.0% |
| FSRS-5-unoptimized | 19.5% | - | 91.9% |
| SM-2 | 1.0% | 8.1% | - |
Inverting the question: if we were on FSRS-5-unoptimized, would we move back to SM-2? Obviously not.
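One way to read the table: each cell and its mirror image sum to 100%, so every collection is counted as a win for exactly one of the two algorithms being compared. A small sketch that encodes the figures from the table and checks this (the dictionary layout and function name are only for illustration):

```python
# superiority[a][b] = % of collections where a estimates recall more accurately than b.
# Values are copied from the table above.
superiority = {
    "FSRS-5": {"FSRS-5-unoptimized": 80.5, "SM-2": 99.0},
    "FSRS-5-unoptimized": {"FSRS-5": 19.5, "SM-2": 91.9},
    "SM-2": {"FSRS-5": 1.0, "FSRS-5-unoptimized": 8.1},
}


def mirrored_cells_consistent(table, tol=1e-9):
    """Check that table[a][b] + table[b][a] is 100% for every pair."""
    return all(
        abs(table[a][b] + table[b][a] - 100.0) < tol
        for a in table
        for b in table[a]
    )
```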
Misuse of 'Hard'
Blocker (IMO), but I feel this can quickly be resolved with a warning, and improved further with future UX/onboarding efforts.
✨ Caveat 3: A new user to Anki under FSRS needs to be warned if misuse of Hard is detected. I'd be happy with something as simple as repurposing the old 'too many decks can slow down your collection' warning on the main screen if hard misuse is detected/suspected. Note: only an example; I don't have skin in the game regarding implementation.
Reduction in Settings/User Control
✨ Caveat 4: (nice to have) - simplify FSRS settings. Most of the settings pane can be hidden behind either an 'advanced' option, or prompted when FSRS is enabled for the first time, then hidden.
Note: This affects power users; regular users are less likely to change settings in general.
Net positive: too many advanced settings are intimidating and increase cognitive load. FSRS provides one lever for the user to pull (desired retention). Education (visually, in the deck options) around why 100% retention isn't ideal could do with improvement, but it's 'good enough'.
The first question a large number of users have is "[I have an exam in X] what settings should I use?". SM-2's options offer numerous opportunities for a user to make mistakes (especially given how unintuitive spaced repetition is to a new user). Options in FSRS are a 'pit of success', and having 1 option to understand is easier than needing to understand intervals, graduation, the answer buttons, steps, lapses etc...
If a power user wants control, they have the option to move back to a specialized algorithm.
@Expertium
> Remember, at least 10% of Anki users misuse Hard, and this is based on nerds from r/Anki. It's likely worse outside of r/Anki, since r/Anki is a place specifically for Anki enthusiasts.
I've seen you say that before, but it seems to me it could just as easily be better outside of Anki-enthusiast communities. [That's not even considering the anti-authoritarian/"I just want to watch the world burn" bent that seems to be more common on Reddit than other parts of the internet. 😅 A certain number of your 18 misusers might have simply been lying. ]
Don't the main reasons for Hard-misuse spring from Anki-guru-fostered and Anki-enthusiast-propagated ideas about how you can "game" the scheduling algorithm? You know, all the same stuff that has caused folks to show up asking for help for the first time with SM-2 settings that are unhinged from reality?
I think average-Jane Anki user when faced with the 4 buttons, and no outside knowledge (for good or for ill) would be more likely to look at Again-Hard-Good-Easy and logically analyze them --
Hard and Easy are opposites and I (instantly, intuitively) understand what it means to say "it was hard" or "it was easy." Since they are balanced on either side of Good, that must be in between them, like saying "it was good enough." Again is outside of that series, but it can't mean "it was again," because that doesn't make any sense. So it must mean "I want to see it again." It looks like it shows me the card again in a few minutes so I can make sure I remember it? Okay, I'll use that when I get it wrong. [[end scene]]
I acknowledge that I have no more support for my position than you do for yours ... but that's pretty much my point. Unless you have a survey of a randomized sample of the "I've been using Anki for 5 years and I just found out about the manual/forum/subreddit/discord/YTers/etc today" contingent, the results will always be useful-but-pliable. No one should die on the hill of protecting that ephemeral 10% of users.
And: a strong +1 to @david-allison 's point about measuring how-much-worse it is for the 8.1% against how-much-better it is for the 91.9%. Data, data, data.
> I wouldn't expect 2) to be solved any time soon, tbh.
I have a solution for that: freeze the stability during same-day reviews if the user wants. It could be implemented in FSRS-rs.
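To make the "freeze" idea concrete, here is a minimal sketch. It assumes the same-day stability update published for FSRS-5, S' = S * exp(w17 * (G - 3 + w18)), with grades 1 to 4 for Again to Easy. The `freeze` flag and the function name are hypothetical, and this is one possible reading of the proposal, not how FSRS-rs implements it:

```python
import math


def same_day_stability(s: float, grade: int, w17: float, w18: float,
                       freeze: bool = False) -> float:
    """Stability after a same-day review; left unchanged when frozen.

    Assumption: the unfrozen branch follows the FSRS-5 short-term formula
    S' = S * exp(w17 * (grade - 3 + w18)). The freeze flag is hypothetical.
    """
    if freeze:
        return s
    return s * math.exp(w17 * (grade - 3 + w18))


# Placeholder parameter values, chosen only for illustration.
frozen = same_day_stability(5.0, grade=3, w17=0.5, w18=0.6, freeze=True)
unfrozen = same_day_stability(5.0, grade=3, w17=0.5, w18=0.6)
```

With freezing on, relearning steps cannot push stability above its post-lapse value, which is the behaviour the second issue above complains about.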
> And: a strong +1 to david-allison's point about measuring how-much-worse it is for the 8.1% against how-much-better it is for the 91.9%. Data, data, data.
Here's a breakdown of the scheduler comparison data, for someone to create a decent looking histogram [warning: heavy page]: https://gist.github.com/david-allison/a623d76654e216478d107655bbb5b2dd
See: https://github.com/ankitects/anki/issues/3616#issuecomment-2525573417 for charts
> A new user to Anki under FSRS needs to be warned if misuse of Hard is detected.
Sadly, me and Jarrett couldn't think of a good way to detect it.
> I have a solution for that: freeze the stability during same-day reviews if the user wants. It could be implemented in FSRS-rs.
So the user would have to decide on their own? That's not a good idea. Most users won't be aware of problems with formulas.
What Jarrett said in discord:
This issue is involved in three factors:
- the ratio: post-lapse stability / last stability
- the impact of the same-day reviews: w[17] and w[18]
- the number of relearning steps
We can automatically detect this issue via taking above factors into account.
That's a different issue. I wasn't talking about short-term S, I was talking about misusing Hard. We can't detect it automatically.
Yeah, I was saying we might detect that one automatically and have FSRS automatically freeze S.
@L-M-Sherlock what about my old idea here? https://forums.ankiweb.net/t/how-to-prevent-users-from-misusing-hard-ideas-are-welcome/49092/61?u=expertium
Just do 3 checks:
- Check if w_15 (a parameter in the parameters field) is <0.01
- Check if RMSE is >5.0%
- Check if the number of reviews is >1000, just to avoid false positives

If all 3 are true, display the following pop-up message:
There is a possibility that you are using the Hard button incorrectly. When you press Hard, Anki assumes that you have successfully recalled the card. Please keep in mind that Hard is not “fail”, it’s “pass”.
It’s not 100% reliable, but it should be reliable enough. And it’s as simple as it gets: three else-if statements and a pop-up. Doesn’t get much simpler than this.
Even if it's not super reliable, it's better than nothing. And the cost of a false positive is small: just mildly annoying the user once.
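The three checks above could be sketched roughly as follows. The thresholds and the message come from the quoted post. The function and argument names are hypothetical, and how Anki would obtain w_15, RMSE, and the review count is left open:

```python
HARD_MISUSE_MESSAGE = (
    "There is a possibility that you are using the Hard button incorrectly. "
    "When you press Hard, Anki assumes that you have successfully recalled "
    "the card. Please keep in mind that Hard is not \"fail\", it's \"pass\"."
)


def maybe_hard_misuse(w15: float, rmse_pct: float, review_count: int) -> bool:
    """Heuristic from the quoted post: all three conditions must hold."""
    return (
        w15 < 0.01               # w_15 from the parameters field is near zero
        and rmse_pct > 5.0       # RMSE above 5.0%
        and review_count > 1000  # enough reviews to avoid false positives
    )


# Example: show the pop-up text only when the heuristic fires.
if maybe_hard_misuse(w15=0.005, rmse_pct=6.2, review_count=1500):
    popup_text = HARD_MISUSE_MESSAGE
else:
    popup_text = None
```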
Alright, here are 2 charts
SM-2 vs FSRS-5 with default parameters:
SM-2 vs FSRS-5 with optimized parameters:
Obvious caveat: SM-2 wasn't designed to predict probabilities, and the only reason it does so in the benchmark is because Jarrett added extra formulas on top of it.
Actually, let's compare them under the most generous (for SM-2) assumptions possible.
- We hooked SM-2 up to the same optimizer that is used by FSRS-5
- We forgot to optimize FSRS-5
Even under these assumptions, FSRS-5 still outperforms SM-2 in 85.7% of cases.