rally
Search profile telemetry device
It'd be sweet if I could add something like `--telemetry search-profile`
and rally could invoke searches one additional time with `profile: true`
and save the profile. I'm thinking something like:
- If the `operation-type` is `search`.
- After the warmup and measurement phases
- Rerun the search one final time, adding `profile: true` to the top level of the search
- Save the results to the telemetry cluster
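The steps above could be sketched roughly like this. It is only an illustration, not Rally code: `should_profile` and `with_profile` are hypothetical helper names, and the operation is assumed to be a plain dict with `operation-type` and `body` keys.

```python
import copy


def should_profile(operation):
    """Only search operations get the extra profiled run (assumed dict shape)."""
    return operation.get("operation-type") == "search"


def with_profile(body):
    """Return a copy of the search body with profile: true at the top level."""
    profiled = copy.deepcopy(body) if body else {}
    profiled["profile"] = True
    return profiled


# Hypothetical operation, shaped like a track's search operation.
operation = {
    "operation-type": "search",
    "body": {"query": {"match_all": {}}},
}

if should_profile(operation):
    # The original body is left untouched so measured runs are unaffected.
    profiled_body = with_profile(operation["body"])
```

Copying the body rather than mutating it keeps the profiled rerun from leaking `profile: true` into the warmup and measurement iterations.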
Maybe we could add them to the output somehow but I'm not super sure how. The profile output is pretty free form and you really just need to dig into it with `jq` most of the time to grab things. And we're not running it on every execution so the perf numbers that come out aren't super useful. Mostly we're looking for stuff like "what queries were actually executed?", "what implementation decisions did the aggregation make?", "how many segments got to use optimal paths?", "did the fetch phase get to take optimal paths?"
@tmgordeeva, @salvatore-campagna, and I talked about doing this by hand a few days ago. And @DJRickyB and I just talked about wanting to get this information from a nightly. So maybe it's worth building something to do this.
The feature as you've described it is not supported by the current structure of Rally
- Most pedantically, I don't think this would be a telemetry device as we've never coupled telemetry devices with specific operations before
- And-one execution may be doable but running this via the current mechanism would mean at best sticking this in a new iteration type (`profile` in addition to `warmup` and `normal`) and nesting the profile in the `meta` field of the metric object.
How about either of these:
- We influence global/local profiling via `--profile=true` in the CLI or `"profile": true` at the task-level, exactly as we have `--on-error=abort` or `"ignore-response-error-level": "non-fatal"`. This is not meant for true measurements but will generate a profile for each configured iteration, and stick it in the metric store in the `meta` field as described above
- We add a new `profile` sub-command which (like `--test-mode`) runs queries a limited number of times (once?) BUT outputs a JSON file where the top-level keys are the tasks (which we already enforce uniqueness on) and the values are the response objects from the `search` type tasks, including with `profile: true` in the request. We'd still support track parameters (to influence things like `ingest_percentage`) and `--include-tasks` and `--exclude-tasks`
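To make the second option a bit more concrete, here is a minimal sketch of the output shape only. Nothing here exists in Rally: `collect_profiles` and `run_search` are hypothetical names, the flat task dicts are an assumed simplification, and `fake_search` stands in for a real Elasticsearch call.

```python
import json


def collect_profiles(tasks, run_search):
    """Run each search task once with profile: true, keyed by task name."""
    results = {}
    for task in tasks:
        if task["operation-type"] != "search":
            continue  # only search-type tasks are profiled
        body = dict(task.get("body") or {})
        body["profile"] = True
        results[task["name"]] = run_search(body)
    return results


def fake_search(body):
    """Stand-in for the cluster; echoes whether profiling was requested."""
    return {"took": 1, "profile_requested": body.get("profile", False)}


# Hypothetical tasks; names are unique, as the track already enforces.
tasks = [
    {"name": "index-append", "operation-type": "bulk"},
    {"name": "match-all", "operation-type": "search",
     "body": {"query": {"match_all": {}}}},
]

profiles = collect_profiles(tasks, fake_search)
# Top-level keys are task names, values are the response objects.
output = json.dumps(profiles, indent=2)
```

Because task names are already unique, they can serve directly as the top-level keys without any extra disambiguation.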
I think I like the second one better, and it doesn't involve retrieval from a larger document in the metrics store.
I think it's useful to get the profiling against a fully loaded data set that's "hot" from a benchmark run. It's a little more "real".
I do like the idea of dumping this to a json file locally, though it could be useful to get the profile results from last night's benchmark run, so it might not make sense to just write them to a json file. For what it's worth I tend to use `jq` on the results of running the profiler to dig out the interesting pieces. Having the whole thing is nice, but I often have to tabularize it with `jq` and then dig further. So a json file would be wonderful.
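For illustration, the kind of tabularizing described above might look something like this in Python rather than `jq`. It is a sketch under assumptions: `flatten_queries` is a hypothetical helper, the field names (`profile.shards[].searches[].query[]` with `type`, `description`, `time_in_nanos`, `children`) follow the search profile response layout, and the sample response is invented and heavily trimmed.

```python
def flatten_queries(response):
    """Flatten the nested query profile tree into one row per query node."""
    rows = []

    def walk(shard_id, node, depth):
        rows.append({
            "shard": shard_id,
            "depth": depth,
            "type": node["type"],
            "description": node["description"],
            "time_in_nanos": node["time_in_nanos"],
        })
        for child in node.get("children", []):
            walk(shard_id, child, depth + 1)

    for shard in response.get("profile", {}).get("shards", []):
        for search in shard.get("searches", []):
            for node in search.get("query", []):
                walk(shard["id"], node, 0)
    return rows


# Invented, trimmed sample response; real output has many more fields.
sample = {
    "profile": {"shards": [{
        "id": "[node][index][0]",
        "searches": [{"query": [{
            "type": "BooleanQuery",
            "description": "+a:1 +b:2",
            "time_in_nanos": 300,
            "children": [
                {"type": "TermQuery", "description": "a:1", "time_in_nanos": 100},
                {"type": "TermQuery", "description": "b:2", "time_in_nanos": 150},
            ],
        }]}],
    }]}
}

rows = flatten_queries(sample)
for row in rows:
    print("  " * row["depth"] + row["type"], row["time_in_nanos"])
```

The flat rows are easy to sort or filter, which covers the "what queries were actually executed?" case; aggregation and fetch profiles would need their own walkers.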
I don't really care about whether or not this is a telemetry device. I guess to me it feels like a telemetry device because it gets extra data about the run. But regular telemetry devices don't work that way, for sure. Also! It might disrupt the benchmark. It's rare, but possible for `profile: true` to convince the JVM that a particular call is megamorphic. If that's on the hot path it can slow down queries. It shouldn't, but computers are fun!