slatedb: Ids Proposal (encoded for ASC or DESC `created_at` order, with limit and offset)
Proposal
In Snowflake (you can see the id is listed in this with description: https://docs.snowflake.com/en/sql-reference/account-usage/query_history) and pretty much all databases, the ids were always just numbers like in an array. To use this will need a full TableProvider or any way of passing order by, order direction, offset & limit to the DF Views.
This will work for the default usage, so when a person just opens the search page everything will need to be sorted by created_at. Timestamps as ids are not durable enough. But indicies (with atomic counters) are, 0 indice is the first created and etc.
So why would this work for ASC and DESC imagine we have 10 objects from 0 to 9. This means that for ASC if we want to offset and limit we would say give me from offset..offset+limit. But if we want for DESC we would say I want from counter - (limit + offset)..counter - offset, since we can't seek backwards, but it doesn't matter we just want to get the right objets and then we can flip them in the query engine.
Consider this [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], counter = 10 and we want to get [6, 7]:
- In ASC order what would we give to the query engine as offset and limit? 6 and 2 which would result in 6..6+2 = 6..8 (excluded in slatedb) and we would get [6, 7].
- In DESC order what would we give to the query engine as offset and limit? 2 and 2, which would result in 10 - (2 + 2)..10 - 2 = 6..8 (again exluced in slatedb) and we would get [6, 7], flip this in query engine.
Now lets's asy we want [0, 1, 2, 3, 4, 5, 6, 7, 8], counter = 9 and we want to get [4, 5, 6]:
- In ASC order: offset = 4, limit = 3, so we get 4..4+3 =4..7 (excluded) => [4, 5, 6].
- In DESC order: offset = 2, limit = 3, so we get 9 - (2 + 3)..9 - 2= 4..7 (excluded) => [4, 5, 6], flip this in query engine.
THIS DOESN'T WORK with single deletion. There may be a formula, but it's hard to calculate it out, with included deletion. ~So better and easier to just take the whole range, skip the offset, and then take until the limit.~
Impl details
- u64 as id
- AtomicU64 as counter
- We had a problem with names where for example -
mydb1,mydb2&mydb10were in this order actually:mydb1,mydb10&mydb2. There are possibly two ways to solve this:- First, just add one
NULchar (the smallest ascii symbol) to the end of the id. - Second, (if the first approach is insufficient), will have to use this forumla: u64::MAX % 10 (const, also a ocuple of times) - id % 10 (couple of times, until we get zero) = how much
NULsymbols we should add after the id, to make it order correctly.
- First, just add one
- This should allows us to keep the real order:
qh/0NUL(NUL)
Pros
- Lexicographically sortable
- For created_at we encode offset, limit and ASC, DESC which is the default use case (or atleast should be)
Cons
- We need to store a counter in each db for schemas, and we need to store a counter in each schema for tables
- Deletion, I think this is the biggest problem, this may break the math (it may not because of the counter, and the nature of the slate lexicographical search, but will need to have the length and the counter separately perhaps).
Attachment
Snowflake uses IDENTIFIER(value) where value maybe the name of the table/schema/db or a session name given for this session to table/schema/db. You can have multiple name to the same object: https://docs.snowflake.com/en/sql-reference/identifier-literal
Regardless of the type of id it would be beneficial for us to store everything in query engine by id and not by name. Because when will add this support, will have to do things like: session_name -> id, id -> name in query engine. Instead we could have any_name -> id and query by id always. For impl will have to use a Visitor and check a map for the name (it can be done in memory no need to add an index for this in slatedb, even for the original name, but for original name it can be don as an index if want to). When will load we would add the original name in the map from the slatedb objects.
@rampage644 @osipovartem @YaroslavLitvinov what do you think?
Ok, so I see proposal mostly about replacing timestamp based id by id based on atomic counter . That can work in case our history list is only grows and items will not be deleted. As in case if item removed and it's in range you iterating it will return less items than specified in limit.
Ok, so I see proposal mostly about replacing timestamp based id by id based on atomic counter . That can work in case our history list is only grows and items will not be deleted. As in case if item removed and it's in range you iterating it will return less items than specified in limit.
-
Query history shouldn't have deletion per record, maybe only as a whole batch.
-
I'm talking about all ids for all objects we store in SlateDb, since all of them need to be returned by timestamp as a default use case.
-
I'm rechecking the math with deletion (perhaps I made a mistake with -1 or + 1), it would still work. We just need to keep the length and last element id separately.
There is an alternative approach which is easier to implement (I think there is a formula, but it's hard to calc). We don't really need to build the range this way. Since ids are ordered, we can just take the whole range and skip the offset elements and then take the limit elements after it. No, we don't put all the elements in memory, check the .next() impl of the SlateDb iterator. Each time we don't take the next element into our Vec, we just discard its memory. This also allows us to keep just one counter (no atomic length). Updated the main proposal.
The one thing this changes, or may not change, as from my calculation the DESC order shouldn't be affected by deletion, at least the examples I wrote, (and other I didn't write up)
I think the formula for DESC is always true but one change I made is I start from 1 not from 0.
(length + 1) - (offset + limit)..(Length + 1) - offset
Consider offset = 2, limit = 2 with DESC:
-
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10] last = 10, length = 10, answer should be [7, 8] (10 + 1) - (2 + 2)..(10 + 1) - 2 = 7..9 (excluded)
-
[1, 2, 3, 4, 5, 6, 7, 8, 10] last = 10, length = 9, answer should be [6, 7] (9 + 1) - (2 + 2)..(9 + 1) - 2 = 6..8 (excluded)
-
[1, 2, 3, 4, 5, 6, 7, 10] last = 10, length = 8, answer should be [5, 6] (8 + 1) - (2 + 2)..(8 + 1) - 2 = 5..7 (excluded)
-
[1, 4, 5, 6, 7, 10] last = 10, length = 6, answer should be [5, 6] (6 + 1) - (2 + 2)..(6+1) - 2 = 3..5 NOPE not right
-
I guess skipping and taking is easier :(