[RFC] Files Recommendations: Improve algorithm and data sources
Goals
I think we can improve the current state of file recommendations. Right now we support three providers:
- Recently Commented
- Recently Edited
- Recently Shared
Suggested sources
- Recently Accessed
- Recently Uploaded (because we keep the mtime when uploading, so new files might not appear as Recently Edited)
- Frequently Accessed
- Time of day: files accessed at specific times of day?
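
To make the "Time of day" idea more concrete, here is a minimal sketch, not an existing part of the app. It assumes we can get a list of past access timestamps per file; the function name `timeOfDayScore` and the `$windowHours` parameter are hypothetical. It uses UTC hours for simplicity, whereas a real provider would probably use the user's timezone.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: fraction of a file's past accesses that happened
 * within $windowHours of the current hour of day (wrapping around midnight).
 *
 * @param int[] $accessTimestamps Unix timestamps of past accesses (assumed input)
 * @param int $now Current Unix timestamp
 * @param int $windowHours Allowed distance in hours (assumption, to be tuned)
 */
function timeOfDayScore(array $accessTimestamps, int $now, int $windowHours = 1): float {
    if ($accessTimestamps === []) {
        return 0.0;
    }

    // 'G' is the 24-hour format without leading zeros (0 to 23), in UTC here.
    $currentHour = (int) gmdate('G', $now);
    $matches = 0;

    foreach ($accessTimestamps as $timestamp) {
        $hour = (int) gmdate('G', $timestamp);
        // Circular distance, so 23:00 and 00:00 count as one hour apart.
        $diff = abs($hour - $currentHour);
        $distance = min($diff, 24 - $diff);
        if ($distance <= $windowHours) {
            $matches++;
        }
    }

    return $matches / count($accessTimestamps);
}
```

The result is in the range 0 to 1, so it could be fed into a combined score like any other source.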
Aggregation bonus
- Some data could be grouped too. For example, if you frequently access more than x files in the same folder, we could also recommend the folder directly?
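
A rough sketch of how the folder aggregation could work, assuming we already have an access count per file path. The function name `foldersToRecommend` and both thresholds are hypothetical and only here for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: recommend a folder when more than $fileThreshold of
 * its files are accessed at least $minAccesses times.
 *
 * @param array<string,int> $accessCounts File path => access count (assumed input)
 * @param int $minAccesses Accesses needed for a file to count as "frequent"
 * @param int $fileThreshold The "x" from the proposal above
 * @return string[] Folder paths to recommend
 */
function foldersToRecommend(array $accessCounts, int $minAccesses, int $fileThreshold): array {
    $frequentFilesPerFolder = [];

    foreach ($accessCounts as $path => $count) {
        if ($count < $minAccesses) {
            continue;
        }
        // Only the direct parent folder is considered in this sketch.
        $folder = dirname((string) $path);
        $frequentFilesPerFolder[$folder] = ($frequentFilesPerFolder[$folder] ?? 0) + 1;
    }

    $folders = [];
    foreach ($frequentFilesPerFolder as $folder => $fileCount) {
        // "More than x files", so strictly greater than the threshold.
        if ($fileCount > $fileThreshold) {
            $folders[] = (string) $folder;
        }
    }

    return $folders;
}
```

Whether the folder should replace its files in the results or be shown alongside them is left open here.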
Distribution
I also suggest adding a combined weighted score. It is something that can be fine-tuned over time, and it would let us boost the relevance of a result that matches multiple criteria (placeholder values used in the example below):
$score = (
0.3 * $recencyScore +
0.2 * $frequencyScore +
0.1 * $editScore +
0.1 * $sharedScore +
0.1 * $commentedScore
);
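
As a sketch of how this could look in practice, the snippet below combines per-source scores using the example weights above and ranks candidates by the result. It assumes each source returns a score between 0 and 1. The function names `combinedScore` and `rankCandidates` are hypothetical, and dividing by the sum of the weights is my own addition so the weights do not have to add up to 1.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: weighted average of per-source scores.
 *
 * @param array<string,float> $sourceScores Source name => score in [0, 1] (assumed)
 * @param array<string,float> $weights Source name => weight
 */
function combinedScore(array $sourceScores, array $weights): float {
    $totalWeight = array_sum($weights);
    if ($totalWeight <= 0) {
        return 0.0;
    }

    $score = 0.0;
    foreach ($weights as $source => $weight) {
        // A source that did not return the file contributes nothing.
        $score += $weight * ($sourceScores[$source] ?? 0.0);
    }

    // Normalize so the result stays in [0, 1] (not part of the formula above).
    return $score / $totalWeight;
}

/**
 * Hypothetical helper: order candidate files by combined score, best first.
 *
 * @param array<string,array<string,float>> $candidates File => per-source scores
 * @param array<string,float> $weights Source name => weight
 * @return string[] File names, highest score first
 */
function rankCandidates(array $candidates, array $weights): array {
    $scores = [];
    foreach ($candidates as $file => $sourceScores) {
        $scores[$file] = combinedScore($sourceScores, $weights);
    }
    arsort($scores); // sort descending by score, keeping the file keys

    return array_map('strval', array_keys($scores));
}

// Same placeholder weights as in the example above.
$weights = [
    'recency' => 0.3,
    'frequency' => 0.2,
    'edit' => 0.1,
    'shared' => 0.1,
    'commented' => 0.1,
];
```

With these weights, a file that only matches "shared" scores much lower than one that is both recently and frequently accessed, which is the kind of behaviour we could then tune.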
There was already https://github.com/nextcloud/recommendations/issues/856 which lists a bunch of issues with ideas for improvements.
The biggest issue I have with recommendations right now is that the files are usually not relevant to me at all. E.g. if you check your recommended files on our company instance, they are likely to be marketing files.
It would be nice if we could consider or prefer things like:
- Folders/files which are your favorites
- Folders/files which you often open and use
- Folders/files which your closest colleagues use and modify
- Somehow factoring in other things, e.g. I currently consider the conference-related documents important (the ones Eirini shared for us to fill in)
- Probably more
This would help to limit the amount of files we are looking at, and thus help show you more relevant ones.
FYI @kra-mo if you have further ideas regarding this. :)
I am currently working on Frequently Accessed.
I guess adding favorites would not be that hard.
And if we do the last point of my post, it should help us adjust which source is most important :)
- Finish Frequently Accessed
- Add Favourites
- Implement a float weight per source
- Test and adjust to see what makes the most sense
@skjnldsv any news regarding this by the way, or something that could need review, or where you need input? :)
It's been postponed and I'm on vacation :)