Nested Set woes
Drive currently uses the Nested Set implementation shipped with FF, although purely cosmetically. Internally, for almost all operations, Drive still maintains an Adjacency List: every entity points to its parent entity. The handful of places where Drive calls nested set methods can, for the most part, be easily replaced.
All of this mostly "functions", although whether it is correct is up for debate. A nested set in FF represents exactly one doctype (a table, for the uninitiated), so when we create a drive_entity it exists within the context of the entire table. The nested set is therefore shared by all users.
If users A through Z exist, user Z uploading a file results in updating all the rows in `Drive Entity` to update the corresponding `lft` and `rgt` graph values.
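To make the write amplification concrete, here is a minimal sketch of a leaf insert into a single shared nested set. It runs against an in-memory SQLite database; the `entity` table, its columns, and the `insert_leaf` helper are hypothetical stand-ins and not FF's actual code. How many rows get rewritten depends on where the parent sits in the `lft`/`rgt` numbering.

```python
import sqlite3


def make_db():
    """Build a tiny shared nested set: one home folder per user, all in one table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entity (name TEXT PRIMARY KEY, owner TEXT, lft INTEGER, rgt INTEGER)"
    )
    homes = [("Home A", "A", 1, 2), ("Home B", "B", 3, 4), ("Home Z", "Z", 5, 6)]
    conn.executemany("INSERT INTO entity VALUES (?, ?, ?, ?)", homes)
    return conn


def insert_leaf(conn, name, owner, parent):
    """Insert a leaf as the last child of `parent`; return how many existing rows were rewritten."""
    (parent_rgt,) = conn.execute(
        "SELECT rgt FROM entity WHERE name = ?", (parent,)
    ).fetchone()
    # Every row whose rgt is at or past the insertion point must be renumbered.
    shifted = conn.execute(
        "SELECT COUNT(*) FROM entity WHERE rgt >= ?", (parent_rgt,)
    ).fetchone()[0]
    conn.execute("UPDATE entity SET rgt = rgt + 2 WHERE rgt >= ?", (parent_rgt,))
    conn.execute("UPDATE entity SET lft = lft + 2 WHERE lft > ?", (parent_rgt,))
    # The new leaf takes the two slots that were just freed.
    conn.execute(
        "INSERT INTO entity VALUES (?, ?, ?, ?)",
        (name, owner, parent_rgt, parent_rgt + 1),
    )
    return shifted


conn = make_db()
rows_rewritten = insert_leaf(conn, "a.txt", "A", "Home A")
print(rows_rewritten)
```

In this sketch user A uploads one file, yet the home folders of users B and Z are renumbered as well, because all of them live in the same numbering space.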
Problems:
- Well, all the writes, for one. That isn't much of an issue for other nested set implementations, but it is for Drive, an application that could be classified as "write-heavy".
- Moving subtrees in nested sets. There are only a couple of cases where we need such an operation (e.g. user A deletes a shared folder containing files from other users, and we'd need to change the root of those files to their respective owners).
Potential solutions (in order of preference):
- Stick to the adjacency list model and interface with it using recursive Common Table Expressions. This seems to be the go-to solution over nested sets ever since DBs started supporting recursion. Pypika doesn't seem to support this, so it would involve writing raw SQL.
- Figure out a way to separate out the nested sets, so one nested set per user. This effectively also solves the instance-wide writes, although it breaks the FF model quite a bit. Maybe it can be done, but I haven't thought of a good enough way to make this work without massive side effects.
- Leave it alone and wait to see whether the extra DB writes prove to be an actual issue.
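As a rough illustration of the first option, the sketch below walks an adjacency list with a recursive CTE written as raw SQL. It uses an in-memory SQLite database to stay self-contained; the `entity` table, the `parent` column, and the `descendants` and `move` helpers are assumptions made for the example, not Drive's actual schema or API.

```python
import sqlite3

# Recursive CTE: start at one folder, then repeatedly join children onto the rows found so far.
DESCENDANTS_SQL = """
WITH RECURSIVE subtree(name, depth) AS (
    SELECT name, 0 FROM entity WHERE name = ?
    UNION ALL
    SELECT e.name, s.depth + 1
    FROM entity AS e
    JOIN subtree AS s ON e.parent = s.name
)
SELECT name, depth FROM subtree ORDER BY depth, name
"""


def descendants(conn, root):
    """Return (name, depth) for `root` and everything below it."""
    return conn.execute(DESCENDANTS_SQL, (root,)).fetchall()


def move(conn, name, new_parent):
    """Re-parent a whole subtree by rewriting a single row."""
    conn.execute("UPDATE entity SET parent = ? WHERE name = ?", (new_parent, name))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entity (name TEXT PRIMARY KEY, parent TEXT)")
conn.executemany(
    "INSERT INTO entity VALUES (?, ?)",
    [
        ("Home A", None),
        ("Docs", "Home A"),
        ("report.pdf", "Docs"),
        ("notes.txt", "Home A"),
        ("Home Z", None),
    ],
)

tree = descendants(conn, "Home A")
move(conn, "Docs", "Home Z")
moved = descendants(conn, "Home Z")
print(tree)
print(moved)
```

Here an upload is one inserted row and a subtree move is one updated row, while the cost shifts to reads that have to recurse. The query assumes the parent pointers contain no cycles; with `UNION ALL` a cycle would recurse indefinitely.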
If changing the implementation makes operations like deleting a folder or moving files to another folder easier, then it makes sense to do it. Core changes like these should also happen as early as possible, since Drive doesn't have many users yet. That said, I would avoid fancy solutions unless they provide a huge advantage.