featurebase
Result columns limit offset support
Description
Some query results have a very large number of columns, but I only want a slice of them. I can slice the result by hand, but that is not memory efficient, because the full result still takes a lot of memory. Can Pilosa support limit/offset as SQL does, returning only the exact slice I ask for? In other words, what I want is paging.
Success criteria (What criteria will consider this ticket closeable?)
Limit/offset, just like SQL.
The query endpoint takes a shards parameter, with which you can emulate a kind of paging.
For example:
.../query?shards=0,1,2,3
@jaffee Sorry for the late reply. I know about shards, but they are not what I want. I want a slice of the result columns queried from the whole index. I know this is not a typical use case for Pilosa (after all, Pilosa is not a database), but thanks anyway.
If you only pass one shard at a time, you'll get at most 1M (2^20) column results back per request, so you can "page" through all the columns one shard at a time.
@jaffee Is there an API that tells me how many shards I have, and what their numbers are? Thanks.
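The shard-at-a-time approach can be turned into something closer to SQL's OFFSET/LIMIT on the client side. The following is a minimal sketch, not part of Pilosa: `fetchShard` is a hypothetical callback that is assumed to run the query against a single shard and return the matching column IDs in ascending order, and `pageColumns` is an illustrative name. At most one shard's worth of columns is held in memory at a time.

```go
package main

import "fmt"

// fetchShard is a hypothetical callback (not a Pilosa API): it is assumed to
// return the matching column IDs for one shard, in ascending order.
type fetchShard func(shard uint64) ([]uint64, error)

// pageColumns emulates OFFSET/LIMIT by walking shards 0..maxShard in order,
// skipping the first `offset` columns and collecting up to `limit` columns.
func pageColumns(fetch fetchShard, maxShard uint64, offset, limit int) ([]uint64, error) {
	if limit <= 0 {
		return nil, nil
	}
	if offset < 0 {
		offset = 0
	}
	out := make([]uint64, 0, limit)
	for shard := uint64(0); shard <= maxShard && len(out) < limit; shard++ {
		cols, err := fetch(shard)
		if err != nil {
			return nil, err
		}
		if offset >= len(cols) {
			// The whole shard falls before the requested offset.
			offset -= len(cols)
			continue
		}
		cols = cols[offset:]
		offset = 0
		if need := limit - len(out); len(cols) > need {
			cols = cols[:need]
		}
		out = append(out, cols...)
	}
	return out, nil
}

func main() {
	// Fake per-shard results standing in for real query responses.
	data := map[uint64][]uint64{0: {1, 5, 9}, 1: {1048576, 1048580}}
	fetch := func(shard uint64) ([]uint64, error) { return data[shard], nil }

	page, err := pageColumns(fetch, 2, 2, 2)
	fmt.Println(page, err)
}
```

Note that a deep offset still fetches every earlier shard, because the sketch has no other way to learn how many columns each shard contributes.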
Shards are derived from column IDs: take any column ID, divide it by the shard width (2^20), and the integer quotient is the shard it is in. There is an undocumented /internal/shards/max endpoint which returns the max shard for each index.
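The column-to-shard arithmetic is short enough to sketch directly. This assumes the shard width of 2^20 given above; the function names are illustrative and are not taken from the Pilosa codebase.

```go
package main

import "fmt"

// shardWidth is the number of columns per shard (2^20), as stated in the thread.
const shardWidth uint64 = 1 << 20

// shardOf returns the shard that holds the given column ID (integer division).
func shardOf(columnID uint64) uint64 {
	return columnID / shardWidth
}

// shardRange returns the first and last column IDs that belong to a shard.
func shardRange(shard uint64) (first, last uint64) {
	first = shard * shardWidth
	return first, first + shardWidth - 1
}

func main() {
	fmt.Println(shardOf(1048575), shardOf(1048576)) // last column of shard 0, first of shard 1
	first, last := shardRange(2)
	fmt.Println(first, last)
}
```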
@young118 At the API level, there's a method called AvailableShardsByIndex() which returns a map of index to available shards (as a roaring bitmap).
It doesn't appear that this API is surfaced in an HTTP endpoint; the only thing available there is GET /internal/shards/max, which returns a map of index to max shard. So using that, the assumption is that you have data in shards 0 through the max shard, and you can then page over that range. Obviously that's not always ideal; we need to provide an HTTP endpoint which supports AvailableShardsByIndex.
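Paging over the range 0 through the max shard amounts to issuing one query per shard. The sketch below only builds the per-shard request URLs. It assumes the max shard has already been obtained (for example from /internal/shards/max, whose response format is not shown in this thread) and assumes the query path has the form /index/<index>/query. The host, index name, and function name are placeholders.

```go
package main

import (
	"fmt"
	"net/url"
)

// shardQueryURLs returns one query URL per shard in 0..maxShard, each using
// the shards parameter to restrict the query to a single shard.
// Assumption: the query endpoint lives at <host>/index/<index>/query.
func shardQueryURLs(host, index string, maxShard uint64) []string {
	urls := make([]string, 0, maxShard+1)
	for shard := uint64(0); shard <= maxShard; shard++ {
		u := fmt.Sprintf("%s/index/%s/query?shards=%d", host, url.PathEscape(index), shard)
		urls = append(urls, u)
	}
	return urls
}

func main() {
	// Placeholder host and index name.
	for _, u := range shardQueryURLs("http://localhost:10101", "myindex", 3) {
		fmt.Println(u)
	}
}
```

Shards in that range that hold no data simply return empty results, which is the cost of assuming the range is dense.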
@jaffee thanks, that's a good idea
@travisturner thanks, helpful