
add documentation about memory usage

Open jaffee opened this issue 6 years ago • 1 comments

When Pilosa has freshly imported data and is not serving any queries, it is possible to get a fairly accurate upper bound on its memory usage with a simple calculation.

The actual roaring bitmap data is mmapped and off-heap, so heap usage is dominated entirely by containers. See snippet of inuse_space profile below:

File: pilosa
Type: inuse_space
Showing nodes accounting for 23.73GB, 99.41% of 23.87GB total
Dropped 71 nodes (cum <= 0.12GB)
     flat  flat%   sum%        cum   cum%
  19.16GB 80.25% 80.25%    19.16GB 80.25%  github.com/pilosa/pilosa/roaring.NewContainer (inline)
   4.58GB 19.16% 99.41%     4.58GB 19.16%  github.com/pilosa/pilosa/enterprise/b.glob..func1

Multiply:

  • number of shards
  • number of time views
  • total # of rows in all fields
  • 16 containers per row
  • 80 bytes per container

to get the total number of bytes used by the actual Container structs. That will be about 80% of total memory use. The rest is tracking of which container has which key.
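As a rough sketch of the multiplication above: the 16 containers per row, 80 bytes per container, and the roughly 80% share come from this issue, while the function name, parameter names, and sample inputs are made up for illustration. It only covers the freshly imported, no-queries case and ignores caches and key translation.

```python
# Constants taken from the issue text; all names here are hypothetical.
CONTAINERS_PER_ROW = 16
BYTES_PER_CONTAINER = 80
CONTAINER_SHARE_PERCENT = 80  # Container structs are ~80% of heap in the profile


def estimate_heap_bytes(shards, time_views, total_rows):
    """Return (container_bytes, estimated_total_heap_bytes).

    total_rows is the total number of rows across all fields.
    """
    container_bytes = (
        shards * time_views * total_rows * CONTAINERS_PER_ROW * BYTES_PER_CONTAINER
    )
    # Scale up so that the Container structs account for ~80% of the total.
    total_bytes = container_bytes * 100 // CONTAINER_SHARE_PERCENT
    return container_bytes, total_bytes


if __name__ == "__main__":
    # Made-up inputs, purely for illustration.
    containers, total = estimate_heap_bytes(shards=100, time_views=1, total_rows=10_000)
    print(f"Container structs: {containers / 2**30:.2f} GiB")
    print(f"Estimated heap:    {total / 2**30:.2f} GiB")
```

Integer arithmetic is used for the final scaling so the estimate does not pick up floating-point noise. Treat the result as an upper bound for planning, not a measurement.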

The documentation also needs to cover how queries and the row cache affect memory. Mmapped data should get evicted, but it still affects total memory usage.

We can also see that cutting the size of the Container struct would be pretty worthwhile.

jaffee, Dec 20 '18 04:12

Set field caches and key translation also need to be accounted for.

jaffee, Dec 31 '18 23:12