typedb
typedb copied to clipboard
Ordering - Use Cases
Problem to Solve
Ordering in TypeDB is currently far from optimized.
There had been a lot of discussions about how it can be solved but this issue comes to address a more primal question of what are the concrete use cases for ordering.
A suggestion on how to use this issue is to write the use cases in the comments and add a thump up if a use case is similar to yours.
If there are more similar Use Cases issues there should be a label for them. While opening a related issue it can be linked to the use cases and maybe having a "semi-dialog" in it about what are the use cases it about to solve and what is out of scope.
To keep the issue context and in case it will have value, here are some links from the how discussion: https://discord.com/channels/665254494820368395/680460517751521356/858228189159227392 https://discord.com/channels/665254494820368395/680460517751521356/859064975804334112
My use cases are the following (ordered by priority from my perspective 😀 ):
- I want to be able to fetch objects "in pages" (e.g. 20-40) from a set of objects with an ordering key (e.g., last modification epoch time).
- I have ToDo items entities that the user or the system creates and the user can prioritize them and putting them upper or lower in his view. I need this custom order to be kept. Some of the ToDo items are shared across multiple users and each can prioritize it differently.
- I have in my roadmap a document entity which is built from many entities such as paragraphs, bullets, ... (it is important to me to break it down into independent entities for the next steps and for relations) . Obviously, I need to be able to keep these entities sorted and in a constant order to keep the document to be comprehensible. A document owner can add content in between two existing paragraphs, move paragraphs around, and delete one. A paragraph can be a part of multiple documents and in each, it is in a different position. Think about this block of text and its parts and that multiple users can prioritize/rank (some of) its parts differently.
A document was written by @modeller TypeQL_-_List_Behaviour.docx
These requirements appear to be list-based, assuming an ordered list that can contain any thing (entity, relation, attrubute). thus the requirements above, can be combined with the vector requirements from #6327
The use cases you've described above can already be achieved with the sort
, offset
, and limit
functionality described here.
However I agree that they're not fast and should be optimised and I've triaged this issue onto our list of priorities.
Hello sharing with you an extremely simple example:
Let's say I have 1M ordered nodes (for example ordered with an index in the relation).
If I want to add a new one as the first one... I have to modify a million relations. Just to add one
Alternative:
This is easier to update (just modify the ones before and after) but it's impossible to query them properly.
Workaround:
- Adding a dictionary (index<>Id) at the parent level is the only way i've found to do this properly:
Having this natively would be such a nice feature!
Note:
A list that is updatable and indexable is a problem in most systems: consider a LinkedList
(O(1) update, but O(n) indexed read) vs an ArrayList (O(n) update, but O(1) indexed read).
You can implement a LinkedList in TypeDB with linked relations. An ArrayList can be emulated with an indexed set.
You can do somethink funky where you use a double to index the elements, so that insertion can just choose a double halfway between two existing values, giving you more complex lookups (probably a binary search?) but O(1) inserts. This will work until the doubles run out of precision...
Found an implementation of what you explained. Will implement that in the client side but I think it could be a nice way to have built-in vectors in typeDB, This is something that would help to differentiate against other graphs DBs that don't have linkedlists. And it does not feel like a rare use case.
https://github.com/brendon/ranked-model/blob/ac4079ebeda2a10b79fe6a488ddff45b0c5a5c46/lib/ranked-model/ranker.rb
Heads up: sorting on non-string values (and also not using reasoner) is now optimised natively in TypeDB :)
Just to make sure I understand: So the double option is the best for now? And with skip list (as predefined ranges) or the like as optimization. With the coming arithmetic I will be able to push them around and re-spread if there is a need
On Tue, 10 Jan 2023, 16:47 Joshua Send, @.***> wrote:
Heads up: sorting on non-string values (and also not using reasoner) is now optimised natively in TypeDB :)
— Reply to this email directly, view it on GitHub https://github.com/vaticle/typedb/issues/6390#issuecomment-1377389565, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAIPXPOP4KT36YR6LWN6QPDWRVZBJANCNFSM47P6AOWQ . You are receiving this because you authored the thread.Message ID: @.***>
yeah anything to do with ordering will be best optimised with sorted numbers
What about edge ordering (vectors)? Is there an ETA? Would it be included in the rust refacto? 🤔
What does "edge ordering" mean, @lveillard ? The only thing that could be sorted are things with data value, i.e. attributes.
Hello @haikalpribadi, I'm talking about ordering on-write, not on query, so probably not the right thread :S
Like storing arrays of values: $b has codes [1,2,8,4] for instance, in that particular order
or storing vectors of edges: $a (parent: $p, children: [$1, $2, $3, $4]);