R2 Server "Rollup" Feature request/help.
We have some very high cardinality data that we are representing as an undirected graph.
To reduce the load on m3, we emit the edges of the graph, tagged with "lo" and "hi" endpoints, which drastically reduces the cardinality in the system.
For example, with the following graph:
        A
       / \
    4 /   \ 5
     /     \
    B-------C
        7
We would emit:
node-lo:A node-hi:B => 4
node-lo:A node-hi:C => 5
node-lo:B node-hi:C => 7
Note, and this is the problem, that B appears as both low and high in the graph.
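As a rough illustration of the emission scheme above, here is a minimal Python sketch. It assumes the "lo" and "hi" endpoints are picked by lexicographic order of the node names, which matches the example but is not stated anywhere; `edge_tags` is a hypothetical helper, not part of any m3 client.

```python
def edge_tags(a, b):
    """Return the tags for an undirected edge between nodes a and b.

    Assumption: the endpoint that sorts first becomes "node-lo",
    so (A, C) and (C, A) map to the same series.
    """
    lo, hi = sorted((a, b))
    return {"node-lo": lo, "node-hi": hi}


# Edge weights from the example graph; endpoint order is deliberately mixed.
edges = {("A", "B"): 4, ("C", "A"): 5, ("B", "C"): 7}

emitted = [(edge_tags(a, b), value) for (a, b), value in edges.items()]

for tags, value in emitted:
    print(f"node-lo:{tags['node-lo']} node-hi:{tags['node-hi']} => {value}")
```

Because the ordering is fixed per edge and not per node, a node such as B ends up in "node-hi" for one edge and "node-lo" for another.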
In addition to the edges, we want to calculate per-node "rollups". We currently do this as follows:
fLO = exec(fetch name:failures node-lo:* ... | mapKey node-lo node);
fHI = exec(fetch name:failures node-hi:* ... | mapKey node-hi node);
sLO = exec(fetch name:successes node-lo:* ... | mapKey node-lo node);
sHI = exec(fetch name:successes node-hi:* ... | mapKey node-hi node);
f = exec(fLO | fHI);
t = exec(fLO | fHI | sLO | sHI);
(f | sum node) | asPercent (t | sum node)
If you squint, you will see that this query actually executes 6 fetches, which leads to abysmal performance and very quickly runs us into QoS limits.
The ideal query would look like:
f = fetch name:failures node:* ... | sum node;
t = fetch name:{failures,successes} node:* ... | sum node;
(f) | asPercent (t)
"Rollup" is in quotes because the result is effectively the following per-node data:
node:A => 9
node:B => 11
node:C => 12
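For reference, the per-node numbers above can be reproduced from the three emitted edges by crediting each edge value to both of its endpoints. This is only a client-side Python sketch of the desired semantics, not a proposal for how the server would implement it; `rollup_by_node` is a hypothetical name.

```python
from collections import defaultdict

# The three edge series from the example, as (tags, value) pairs.
edge_series = [
    ({"node-lo": "A", "node-hi": "B"}, 4),
    ({"node-lo": "A", "node-hi": "C"}, 5),
    ({"node-lo": "B", "node-hi": "C"}, 7),
]


def rollup_by_node(series):
    """Sum each edge value into both of its endpoints."""
    totals = defaultdict(int)
    for tags, value in series:
        totals[tags["node-lo"]] += value
        totals[tags["node-hi"]] += value
    return dict(totals)


rollup = rollup_by_node(edge_series)

for node in sorted(rollup):
    print(f"node:{node} => {rollup[node]}")
```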
We know this is not possible today with R2, but we are asking the experts whether it could ever be possible, and for some hints on how we could submit a pull request for such a change.
The workaround we are using today is triple emission:
node-lo:A node-hi:B => 4
node:A => 4
node:B => 4
node-lo:A node-hi:C => 5
node:A => 5
node:C => 5
node-lo:B node-hi:C => 7
node:B => 7
node:C => 7
... Which is not ideal.
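To make the cost of the workaround concrete, here is a small Python sketch of triple emission. `triple_emit` is a hypothetical helper and the (tags, value) pairs stand in for whatever the real metrics client sends.

```python
def triple_emit(lo, hi, value):
    """Return the three data points written for a single edge."""
    return [
        ({"node-lo": lo, "node-hi": hi}, value),  # the edge itself
        ({"node": lo}, value),                    # per-node copy for the low endpoint
        ({"node": hi}, value),                    # per-node copy for the high endpoint
    ]


edges = [("A", "B", 4), ("A", "C", 5), ("B", "C", 7)]

points = [point for lo, hi, value in edges for point in triple_emit(lo, hi, value)]

for tags, value in points:
    label = " ".join(f"{key}:{val}" for key, val in tags.items())
    print(f"{label} => {value}")
```

Every edge turns into three writes, so the number of emitted data points is three times what the edge-only scheme needs.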
Full credit to @NonLogicalDev for the write-up.