rally
Decouple informational Rally subcommands from rally-tracks master branch
Currently certain informational Rally subcommands (like esrally list tracks or esrally info --track=...) implicitly reference the master branch of the default track repository (rally-tracks).
This means that changes on the master branch of https://github.com/elastic/rally-tracks/, such as https://github.com/elastic/rally-tracks/pull/171#issuecomment-828598581, can subtly break the list or other subcommands.
This is problematic because the master branch of https://github.com/elastic/rally-tracks frequently holds bleeding-edge work, and commits there should never break the workflows of users running released versions of Rally against released versions of Elasticsearch.
We should take a deeper look at existing Rally functionality that implicitly uses rally-tracks#master and make things more resilient.
@dliappis @danielmitterdorfer
I started looking into this and chatted about it briefly with @DJRickyB. If I were to restate--but maybe expand--the problem, I'd say that we would want a given released version of Rally to default to a "known-good", compatible revision of rally-tracks in the absence of the user providing either a distribution-version or a track-revision[^1].
If that's a fair characterization, my initial (possibly naive, you tell me!) thought is that we'd need a three-pronged approach in order to provide a robust solution:
- Before a Rally release, test `rally-tracks` against the commit in the `rally` repository that we intend to release. We're already discussing how to improve CI for tracks, so this would dovetail with that.
- Assuming tests pass, tag the commit of `rally-tracks` tested above with the same tag applied to the newly-released version of Rally.
- Alter Rally's behavior such that for released versions of Rally, it defaults to checking out the corresponding tag in the `rally-tracks` repository. The user can override this, however, via `--track-revision` or `--distribution-version`.
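The third step could be sketched roughly as follows. This is only an illustration of the proposed default, not Rally's actual code: `resolve_track_ref`, its parameters, and the rule that a plain `X.Y.Z` version string means "released" are all assumptions made for the example.

```python
import re
from typing import Optional, Sequence, Tuple


def resolve_track_ref(
    rally_version: str,
    available_tags: Sequence[str],
    track_revision: Optional[str] = None,
    distribution_version: Optional[str] = None,
) -> Tuple[str, str]:
    """Hypothetical helper: decide which rally-tracks ref to check out.

    Returns a (kind, value) pair. The names and the return shape are
    assumptions for illustration only.
    """
    # Explicit user input always wins; a track revision is preferred
    # over a distribution version, as described in the footnote.
    if track_revision:
        return ("revision", track_revision)
    if distribution_version:
        return ("distribution-version", distribution_version)

    # Assumption: released versions look like "2.2.1", while development
    # versions carry a suffix such as ".dev0".
    is_release = re.fullmatch(r"\d+\.\d+\.\d+", rally_version) is not None
    if is_release and rally_version in available_tags:
        # Proposed default: use the tag that matches the Rally release.
        return ("tag", rally_version)

    # Fall back to the current behavior.
    return ("branch", "master")
```

With this shape, an unreleased Rally (or a release for which no matching tag exists) keeps today's behavior and uses master.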
In the short term, we could of course simply issue a warning when a released version of Rally is running against rally-tracks/master. But I like the idea of providing stronger compatibility guarantees by default, even if it'll be a little tricky to implement.
There are probably a number of subtleties here that we'd need to think through. But what say you about the overall idea?
[^1]: BTW, it seems that the GitTrackRepository's constructor will prefer a track revision over a distribution version. Might it make sense to make --track-revision and --distribution-version mutually exclusive for the esrally list tracks command?
As far as I understand the proposal, tagging commits would solve the issue on the master branch of rally-tracks (or rally-teams) but not necessarily on any other branch in these repos, like 7.14. It would probably also make it harder for us to add specialized support for new versions of Elasticsearch - by introducing a new branch in said repos - without releasing a new version of Rally? Additionally, this approach would only work with track / team repos that Rally supports out of the box? Also, we should maybe stay away from adding more logic on top of the - already complex - branching logic?
If we take a step back, our goal is that we don't break released versions of Rally with changes to rally-tracks / rally-teams. I wonder whether we should start running our integration test suite in CI not only with Rally master but also with the most recent released version of Rally? If we additionally set this up as a PR check on the respective repos we'd catch these mistakes before they are merged and would achieve our goal as well?
Regarding your footnote:
> BTW, it seems that the GitTrackRepository's constructor will prefer a track revision over a distribution version. Might it make sense to make --track-revision and --distribution-version mutually exclusive for the esrally list tracks command?
IMHO it makes sense to keep this behavior. By default, Rally will pick the correct version based on the --distribution-version command line parameter, but you might want to run a benchmark for a released Elasticsearch version with an older version of the Rally track, e.g. because you need to determine whether a change in performance is caused by a change in the workload. In that sense it is useful to be able to override the track revision that is picked.
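To make the two options concrete, here is a minimal sketch using Python's standard `argparse`. It is not Rally's real argument parsing; `build_parser` and the program name are made up for the example. With `exclusive=True` the parser rejects the combination, as the footnote suggests; with `exclusive=False` both flags are accepted and the caller can let the track revision override.

```python
import argparse


def build_parser(exclusive: bool) -> argparse.ArgumentParser:
    """Hypothetical parser, for illustration only."""
    parser = argparse.ArgumentParser(prog="esrally-list-sketch")
    # A mutually exclusive group makes argparse exit with an error
    # when both flags are passed; the plain parser accepts both.
    target = parser.add_mutually_exclusive_group() if exclusive else parser
    target.add_argument("--track-revision")
    target.add_argument("--distribution-version")
    return parser


if __name__ == "__main__":
    args = build_parser(exclusive=False).parse_args()
    # Keeping the current behavior: an explicit revision overrides the
    # revision that would be derived from the distribution version.
    chosen = args.track_revision or args.distribution_version or "master"
    print(chosen)
```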
> If we take a step back, our goal is that we don't break released versions of Rally with changes to rally-tracks / rally-teams. I wonder whether we should start running our integration test suite in CI not only with Rally master but also with the most recent released version of Rally? If we additionally set this up as a PR check on the respective repos we'd catch these mistakes before they are merged and would achieve our goal as well?
I agree. Our goal is not to break the latest master or released versions of Rally with changes to rally-tracks or rally-teams. I feel that cross-repo integration tests (one way or the other, i.e. via cross-repo triggers or thorough synthetic tests per repo) would secure us against this possibility without having to tackle this programmatically in Rally.