cyanite
Select paths individually in parallel rather than using "in clause"
Our cyanite stack performs substantially better if we request paths individually rather than in bulk. Specifically, requesting 100 metrics in a single select query takes roughly 10 times longer than selecting those same 100 metrics one at a time. This seems to be a problem with heap space on the coordinating Cassandra node. As such, I suggest we change the fetch query to retrieve one metric at a time, then sort the returned rows by time within cyanite. That way there is substantially less heap pressure on Cassandra and only slightly more CPU load on cyanite.
I have branched the latest 0.1.3 version in order to make the change. My revision can be found here: https://github.com/tjamesturner/cyanite/commit/0ba66249fd9ace7ff92266a0df02828c55e67b0f
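For anyone skimming, here is a rough sketch of the idea in Python. It is not the code from the commit linked above, only an illustration of the pattern: issue one single-path query per metric in parallel, then merge and sort the rows by time on the client side. `FAKE_STORE`, `fetch_path`, and `fetch_all` are hypothetical names, and the in-memory dictionary stands in for the Cassandra table so the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List, Tuple

# One result row: (path, time, value). This layout is an assumption for illustration.
Row = Tuple[str, int, float]

# Hypothetical in-memory stand-in for the metric table in Cassandra.
FAKE_STORE: Dict[str, List[Tuple[int, float]]] = {
    "web01.cpu": [(20, 0.7), (10, 0.5)],
    "web02.cpu": [(10, 0.2), (30, 0.9)],
}


def fetch_path(path: str) -> List[Row]:
    """Stand-in for a single-path query (the equivalent of WHERE path = ?)."""
    return [(path, t, v) for t, v in FAKE_STORE.get(path, [])]


def fetch_all(
    paths: Iterable[str],
    fetch: Callable[[str], List[Row]] = fetch_path,
    max_workers: int = 8,
) -> List[Row]:
    """Fetch each path with its own query in parallel, then sort by time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # One small query per path instead of one large IN (...) query.
        per_path = list(pool.map(fetch, paths))
    # Flatten the per-path results and sort client-side by (time, path).
    rows = [row for chunk in per_path for row in chunk]
    rows.sort(key=lambda r: (r[1], r[0]))
    return rows


if __name__ == "__main__":
    for row in fetch_all(["web01.cpu", "web02.cpu"]):
        print(row)
```

The sort happens in the client rather than on the coordinating node, which is where the extra CPU cost on the cyanite side would come from.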
Thanks @tjamesturner. Interesting approach, I'll take this on once the query API is updated.