Load generator tests on the last cache
Once the last cache has been implemented we will want to run a series of load generator tests to see how it performs compared to SQL queries that would be used in its absence.
The only setup required in the load generator should be to create the specs that exercise the queries below. We could have the load generator create the last cache, but it will probably be easy enough to write some data in and create the cache using the CLI before running the load-gen tests; a sketch of that manual setup follows.
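For reference, that manual setup could look roughly like the following. This is only a sketch assuming the influxdb3 CLI: the database name, seed file, and cache name are placeholders, and the exact flags should be verified against influxdb3 create last_cache --help.

# Write some seed data so the table and its schema exist (placeholder names)
influxdb3 write --database load_test --file seed.lp

# Create the last cache keyed on the tag hierarchy; --key-columns, --count,
# and the positional cache name are assumptions to check against the CLI help
influxdb3 create last_cache \
  --database load_test \
  --table cpu \
  --key-columns region,host,cpu \
  --count 1 \
  cpu_last_cache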
Scenarios
Here are some scenarios we want to test.
1. Basic last value queries
In this case, the cache does not need to be keyed on any columns.
Using SQL
SELECT * FROM 'table' ORDER BY time DESC LIMIT 1
Using Last Cache
SELECT * FROM last_cache('table')
2. Multi-level tag/key hierarchy
In this case, the data has a hierarchical tag set, e.g., region/host/cpu. The last cache is keyed using the hierarchy region -> host -> cpu, and we want to compare query performance when using different combinations of predicates.
Using SQL
In general, in situations where we are attempting to pull the N most recent values for a set of time series, we can use a combination of a ranking function, e.g., ROW_NUMBER(), and PARTITION BY, like so:
WITH ranked AS (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY host ORDER BY time DESC) AS rn
FROM cpu
WHERE <predicate>
)
SELECT * FROM ranked
WHERE rn <= N
Here, predicate can be any of the following; a fully instantiated example is sketched after the list:
- host IN ('_', '_', ...)
- region IN ('_', '_', ...)
- host = '_' AND cpu = '_'
- host = '_'
- region = '_'
- etc.
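For instance, taking the region predicate: a single region contains many host/cpu series, so the PARTITION BY needs to cover the rest of the hierarchy for each series to be ranked independently. A sketch, with a placeholder region value:

-- Last value for every host/cpu series within one region
WITH ranked AS (
  SELECT
    *,
    ROW_NUMBER() OVER (PARTITION BY host, cpu ORDER BY time DESC) AS rn
  FROM cpu
  WHERE region = 'us-east'  -- placeholder value
)
SELECT * FROM ranked
WHERE rn <= 1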
Using Last Cache
Here, the last cache is doing the work for us, so we really just need to provide the above predicates like so:
SELECT * FROM last_cache('cpu') WHERE host IN ('_', '_', ...)
SELECT * FROM last_cache('cpu') WHERE region IN ('_', '_', ...)
SELECT * FROM last_cache('cpu') WHERE host = '_' AND cpu = '_'
SELECT * FROM last_cache('cpu') WHERE host = '_'
SELECT * FROM last_cache('cpu') WHERE region = '_'
I'm guessing that the amount of data you write into the table before running the query test is going to have a significant impact on the performance of the SQL queries. It would be interesting to run it with some different amounts of historical data.
> It would be interesting to run it with some different amounts of historical data.
Yes, definitely. I would also like to compare performance/memory usage between the two when using higher cardinality key columns.
Ran a preliminary test yesterday to compare query performance between the following two queries:
- No Cache (yellow line):
SELECT * FROM data WHERE t1 = $t1_id ORDER BY time DESC LIMIT 1
- With Cache (blue line):
SELECT * FROM last_cache('data') WHERE t1 = $t1_id
Details
- Used a single querier in either case, which fires sequential queries as fast as possible, using a different parameterized value for the WHERE clause
- In both cases, I was running a write load at the same time, which was writing ~10k lines (75kB) per second.
- In the (1.) no cache case, there was no last cache configured on the table at all, to remove any computation involved in populating the cache
- Each case was run for 60 minutes, starting with a fresh database
The query latency for the (2.) with cache case doesn't show up because it is getting rounded down to 0 ms, but you can see from the queries per second that it is hovering around the 400-500 µs mark.
Oh, one detail omitted:
- The table in either case has a single tag t1 with cardinality 1k, and two fields: a random string and a random float (a sample line is sketched below)
- The cache was keyed on t1 and stored both fields (plus time)
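For concreteness, a single line of that write load would look roughly like the following line protocol; the field names (f1, f2) and all values here are made up for illustration:

data,t1=t1-id-42 f1="8aB3xQ9Z",f2=0.73281 1721322000000000000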
Will be interesting to see what it looks like if you have 100 values you're pulling back like WHERE t1 in [...]. That query is going to be gnarly on the SQL side. And then an example that uses the last cache hierarchy to pull back an entire group of last values.
I filed https://github.com/influxdata/influxdb/issues/25174 to add support for more filter Exprs - we can definitely support the IN clause.
> That query is going to be gnarly on the SQL side.
💯 not looking forward to composing that.
Update: I figured out the general query structure to select the N most recent values from multiple time series. Note that there is an open issue in DataFusion to optimize such queries (see https://github.com/apache/datafusion/issues/6899), so we should re-run this analysis when that optimization is implemented.
The issue description was updated with the relevant details.
Test: Last 5 Values - 1M Cardinality - Grouping by tag/key
Setup
In this test I compared the following two queries under the same write load.
In either case, the query test was run for 60 minutes, with the same write load running in parallel. Each query test uses a single querier that fires queries sequentially, one after the other, as fast as it can.
1. No Cache (Yellow)
WITH ranked AS (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY t3 ORDER BY time DESC) AS rn
FROM data
WHERE t2 = $t2_id
AND time > now() - INTERVAL '5 minutes'
)
SELECT * FROM ranked
WHERE rn <= 5
- We are grouping on the t2 tag, but partitioning on the t3 tag, because there will be many unique t3 values for each t2
- The WHERE rn <= 5 will pull the 5 most recent values
- I used a look-back time of 5 minutes; choosing an interval that matches the TTL of the cache (4 hours) would likely make the SQL performance look much worse, so I shortened the interval to something I thought would be reasonable, given the write load detailed below (a variant with the longer look-back is sketched after this list)
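For reference, the more conservative variant keeps the same shape and simply widens the look-back to match the cache TTL:

-- Same ranked query, with the look-back widened to the 4 hour cache TTL
WITH ranked AS (
  SELECT
    *,
    ROW_NUMBER() OVER (PARTITION BY t3 ORDER BY time DESC) AS rn
  FROM data
  WHERE t2 = $t2_id
    AND time > now() - INTERVAL '4 hours'
)
SELECT * FROM ranked
WHERE rn <= 5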
2. With Cache (Blue)
SELECT * FROM last_cache('data') WHERE t2 = $t2_id
- The cache uses the key [t1, t2, t3]
- The cache count is 5, so it will store at most 5 items
Write Load / Data Spec
| Parameter | Detail |
|---|---|
| Tags: name (cardinality) | t1 (10), t2 (1k), t3 (1M) |
| Fields: name (type) | f1 (random string), f2 (random float) |
| Lines per sample | 10,000 |
| Writer Count | 5 |
| Writes per second | 50,000 |
| Segment Duration | 5 minutes |
Results
- Queries Per Second is actually Rows Returned Per Second
- The yellow line is the (1.) No Cache scenario, while the blue line is the (2.) With Cache scenario
- Query latency for (1.) did not saturate, i.e., it may have continued to degrade had the test run longer, while query latency for (2.) clearly saturated at 75-100 ms
- CPU usage with the cache (2.) is stable compared to without it (1.)
- Memory usage is comparable between the two
Discussion
Because of the data layout, there should be ~1k unique t3 values for each t2 (the 1M t3 values are spread across the 1k t2 values), and with the cache count of 5, ~5k rows returned per query.
The result above isn't awesome, but I don't want to over-analyze this yet; here are some of the next tests I plan to try:
- [x] The same as this, using a count of 1, i.e., only cache/return the last value (see https://github.com/influxdata/influxdb/issues/25127#issuecomment-2248644354)
- [ ] Use a t3 IN (...) style query to select a smaller subset of t3 values
Test: Last Value - 1M Cardinality - Grouping by tag/key
Setup
This test setup is almost the same as the previous (https://github.com/influxdata/influxdb/issues/25127#issuecomment-2247968491), with the exception that the SQL query was changed to return a single value per series, and the cache was changed to store only one value.
1. No Cache (Yellow)
WITH ranked AS (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY t3 ORDER BY time DESC) AS rn
FROM data
WHERE t2 = $t2_id
AND time > now() - INTERVAL '5 minutes'
)
SELECT * FROM ranked
WHERE rn <= 1 -- <- changed to 1
- The WHERE rn <= 1 will pull the most recent value
2. With Cache (Blue)
SELECT * FROM last_cache('data') WHERE t2 = $t2_id
- The cache uses the key [t1, t2, t3]
- The cache count is 1, so it will store at most 1 item
Write Load / Data Spec
| Parameter | Detail |
|---|---|
| Tags: name (cardinality) | t1 (10), t2 (1k), t3 (1M) |
| Fields: name (type) | f1 (random string), f2 (random float) |
| Lines per sample | 10,000 |
| Writer Count | 5 |
| Writes per second | 50,000 |
| Segment Duration | 5 minutes |
Results
- Queries Per Second is actually Rows Returned Per Second
- The yellow line is the (1.) No Cache scenario, while the blue line is the (2.) With Cache scenario
This is a zoom-in on the (2.) With Cache scenario to see the range of query latencies a bit more clearly:
Discussion
It is still difficult to see, but the query latency when using the cache is almost a square wave, toggling between ~5ms and ~100ms. It would be worth profiling to see what might be causing the slowness during the 100ms periods.
Although it's gradual, the (1.) No Cache query latency is degrading over time and did not saturate during the test, while the (2.) With Cache queries look stable and also reduce CPU load by a factor of 5-8x.
The SQL query you're executing is only looking back 5 minutes, which is almost never what users do when they're looking for last values. That only works if they've actually written a value in the last 5 minutes. Also, your SQL query isn't grouping by the actual tag, so I don't believe you're actually pulling in the last value for every unique t3 in the given t2 group.
Oh, maybe the PARTITION BY t3 accomplishes the grouping?
> The SQL query you're executing is only looking back 5 minutes, which is almost never what users do when they're looking for last values. That only works if they've actually written a value in the last 5 minutes.
Yeah, the 5-minute look-back might be a bit optimistic in the general sense; I figured it was acceptable given that the write load should have values written in that time. I could try with a more conservative value - I originally was going to use the same as the cache TTL, which is 4 hours, but wanted to give the SQL a fighting chance 😅.
> Also, your SQL query isn't grouping by the actual tag, so I don't believe you're actually pulling in the last value for every unique t3 in the given t2 group.
Yeah, it is a bit odd, but the PARTITION BY t3 is what does the grouping. WHERE t2 = $t2_id fixes on a given t2, for which there are many t3 values; the PARTITION BY t3 then groups by unique t3, i.e., it creates a partition for each unique t3 and ranks rows within each partition (a toy version is sketched below).
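To make that concrete, here is a toy version of the same shape using inline values in place of the real table; all values are made up purely for illustration:

-- PARTITION BY t3 ranks rows within each unique t3, so rn = 1 picks the
-- most recent row per t3
WITH ranked AS (
  SELECT
    *,
    ROW_NUMBER() OVER (PARTITION BY t3 ORDER BY time DESC) AS rn
  FROM (VALUES
    ('t2-0', 't3-a', 1, 10.0),
    ('t2-0', 't3-a', 2, 11.0),
    ('t2-0', 't3-b', 1, 20.0),
    ('t2-0', 't3-b', 3, 21.0)
  ) AS data (t2, t3, time, f2)
  WHERE t2 = 't2-0'
)
SELECT * FROM ranked WHERE rn <= 1
-- returns one row per unique t3:
--   t2-0 | t3-a | 2 | 11.0 | 1
--   t2-0 | t3-b | 3 | 21.0 | 1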
I am currently running a similar test to the above, but using an IN predicate that selects a fixed set of 100 t3 values. Once I am done with that, I can try using a longer look-back in the SQL queries.
Test: Last Value - 1M Cardinality - WHERE t3 IN (...)
This test setup is similar to the previous one, but uses an IN clause to select a specific subset of 100 t3 values (out of the 1M total t3 values in the data spec).
1. No Cache (Yellow)
WITH ranked AS (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY t3 ORDER BY time DESC) AS rn
FROM data
WHERE t3 IN ('t3-id-10000', 't3-id-20000', ..., 't3-id-999999') -- 100 specific values total
AND time > now() - INTERVAL '5 minutes'
)
SELECT * FROM ranked
WHERE rn <= 1
2. With Cache (Blue)
SELECT * FROM last_cache('data') WHERE t3 IN ('t3-id-10000', 't3-id-20000', ..., 't3-id-999999') -- 100 specific values total
- The cache uses the key [t1, t2, t3]
- The cache count is 1, so it will store at most 1 item
Write Load / Data Spec
| Parameter | Detail |
|---|---|
| Tags: name (cardinality) | t1 (10), t2 (1k), t3 (1M) |
| Fields: name (type) | f1 (random string), f2 (random float) |
| Lines per sample | 10,000 |
| Writer Count | 5 |
| Writes per second | 50,000 |
| Segment Duration | 5 minutes |
Results
- Queries Per Second is actually Rows Returned Per Second
- The yellow line is the (1.) No Cache scenario, while the blue line is the (2.) With Cache scenario
This is a zoom-in on the (2.) With Cache scenario to see the range of query latencies a bit more clearly:
Discussion
When looking at the actual measured latencies in the (2.) With Cache case, there is a similar trend to before, where the latencies resemble a square wave, this time toggling between ~3ms and ~100ms. The graph doesn't really capture this because of the windowing the analysis app applies to the latencies when producing the graph.
So, again, probably worth profiling to see what is going on during those ~100ms stretches.
Definitely worth a look on the profiling. I think the new WAL and write buffer refactor might have a big impact here because the write locking behavior is going to change quite a bit.
Closing this. The cache was tested with the load generator above. We can open more specific issues following https://github.com/influxdata/influxdb/issues/25562 with respect to profiling and testing the last cache.