add the --concurrency option to launch multiple runs of *.hurl files instead of one
Hi guys 😄
In the context of our project, we see two huge advantages to using Hurl:
- It is an ideal solution for functionally testing all the page sequences, what we call customer paths, on our web application. This allows us to guarantee non-regression between versions.
- The simplicity of its language allows non-developers to generate a multitude of scenarios easily.
However, we cannot "yet" use Hurl for our metrology tests, forcing us to duplicate all our scenarios in other tools that are less accessible to non-developers... 😢
Currently, when I launch this command:

```shell
hurl a.hurl b.hurl c.hurl
```

hurl runs each *.hurl file one time: first a.hurl, then b.hurl, then c.hurl.
```
─ BEGIN
├─ a.hurl
├─ b.hurl
├─ c.hurl
─ END
```
I propose adding a --concurrency option allowing hurl to launch multiple runs of each *.hurl file at a time, for example:

```shell
hurl --concurrency 3 a.hurl b.hurl c.hurl
```
```
─ BEGIN
├─ a.hurl ├─ a.hurl ├─ a.hurl
├─ b.hurl ├─ b.hurl ├─ b.hurl
├─ c.hurl ├─ c.hurl ├─ c.hurl
─ END
```
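For comparison, a rough shell equivalent of what --concurrency 3 would do (a sketch of the proposed semantics, not an existing option):

```shell
# Launch 3 concurrent runs; each run executes a.hurl, b.hurl, c.hurl in sequence.
for i in 1 2 3; do
  hurl a.hurl b.hurl c.hurl &
done
wait  # join the 3 runs before exiting
```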
What do you think about it?
This one is not that easy, there are many ways to do it! One of the main objectives of Hurl is to stay as simple as possible, which might not be compatible with performance testing.
Maybe we could distribute another binary that would also process Hurl files but be dedicated to performance testing. That's what we have done with the hurlfmt binary, which only deals with linting and formatting.
Completely agree with your proposal.
@lepapareil I agree with @fabricereix, there are for sure several ways to implement such a feature.
Nevertheless, I would also like to see a "concurrent" and "bulk" (related to #1139) feature in hurl, but I would suggest providing it at the language level.
Since the hurl request/response file description already comes very close to test/benchmark plans, a possible addition would be to specify optional THREADS and REPEATS values at the top level of the specification, as well as an optional setting at the request/response level.
The following hurl specification would use the two new language keywords, expressing the same semantics as if they were not present at all:

```
THREADS 1
REPEATS 1

POST {{LOCATION}}/endpoint
{
  "key": "value"
}

HTTP 200
# THREADS 1
# REPEATS 1
```
Using a generic mechanism like this would allow you to write almost all the combinations necessary for proper test and benchmark plans on a per-hurl-file basis.
Introducing this feature first would then allow, as an additional step, "nesting" of sub-hurl specifications, e.g. using an INCLUDE <file> statement (without the .hurl extension) referencing other hurl specifications, to formulate @lepapareil's example like this:

```
INCLUDE a
REPEATS 3
INCLUDE b
REPEATS 3
INCLUDE c
REPEATS 3
```
Since this is by nature sequential execution, only the REPEATS keyword would be necessary for this use case.
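For comparison, a rough shell equivalent of this sequential INCLUDE/REPEATS sketch with today's CLI (file names as in the example above):

```shell
# Run each file 3 times, strictly in order: a, a, a, b, b, b, c, c, c.
for f in a.hurl b.hurl c.hurl; do
  for i in 1 2 3; do
    hurl "$f"
  done
done
```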
For the use case of #1139 from @hemedani, it depends on what test behavior is desired, but even there the "bulk" testing can then be achieved in a parallel or sequential fashion, e.g.:
1000 runs in sequence of the whole hurl test:

```
REPEATS 1000
GET http://google.com
```

1000 runs in parallel of the whole hurl test:

```
THREADS 1000
GET http://google.com
```

1000 runs total, 100 in sequence times 10 in parallel, of the whole hurl test:

```
THREADS 10
REPEATS 100
GET http://google.com
```
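A rough shell equivalent of the last combination, for illustration (the THREADS/REPEATS keywords are hypothetical; test.hurl is a placeholder file):

```shell
# 10 parallel workers, each running the file 100 times in sequence: 1000 runs total.
for t in $(seq 10); do
  ( for r in $(seq 100); do hurl test.hurl; done ) &
done
wait
```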
Similar to the delay option, we could also add a repeat option:

```
GET http://google.com
[Options]
repeat: 1000
```
@fabricereix sounds good to me for the repetition use case.
What about the "load testing" use case for parallel request scenarios? There, an additional thread: <concurrency-number> option would be suitable as well, like:

```
GET http://google.com
[Options]
repeat: 3
thread: 2
```
Like I stated above, it would be nice to also be able to specify the repeat and thread options (in this case) at the top level of the hurl specification, to express a test/benchmark scenario like:

```
[Options]
repeat: 2
thread: 2

POST https://example.org
[Options]
repeat: 3

GET https://example.org
[Options]
thread: 3
```
The whole specification is repeated twice, and each run executes in 2 parallel threads (not synchronized, except at the end). The first request executes its repetitions purely sequentially, whereas the second request of a test run is executed three times in parallel and synchronized at the end of the response (bounded parallelism).
A visual of the example test specification would be:
```mermaid
flowchart LR
  s0(hurl) -->|run| s1{{fork}}
  s1 --> s2
  s1 --> s3
  s6{{join}}
  s7{{end}}
  s2 --> s4
  s3 --> s5
  s4 --> s6
  s5 --> s6
  s6 --> s7
  subgraph run
    direction LR
    subgraph s2[1st thread, 1st repetition]
      direction LR
      subgraph s20[repeat 3]
        direction LR
        s200[POST] --> s201
        s201[POST] --> s202
        s202[POST]
      end
      subgraph s21[thread 3]
        direction LR
        s210[GET]
        s211[GET]
        s212[GET]
        s213{{fork}}
        s214{{join}}
        s213 --> s210
        s213 --> s211
        s213 --> s212
        s210 --> s214
        s211 --> s214
        s212 --> s214
      end
      s20 --> s21
    end
    subgraph s3[2nd thread, 1st repetition]
      s30[same as 1st thread, 1st repetition]
    end
    subgraph s4[1st thread, 2nd repetition]
      s40[same as 1st thread, 1st repetition]
    end
    subgraph s5[2nd thread, 2nd repetition]
      s50[same as 1st thread, 1st repetition]
    end
  end
```
For the concurrency use case, it would be interesting to try it first with an external script using GNU parallel. It would help to find good semantics.
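A minimal sketch of that approach, assuming GNU parallel is installed (file names are just examples): each file is run 3 times, with at most 3 jobs at a time.

```shell
# {1} picks the file from the first input source; the second source (1 2 3)
# only multiplies the combinations, so each file is executed 3 times.
parallel -j 3 hurl {1} ::: a.hurl b.hurl c.hurl ::: 1 2 3
```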
I agree for a global concurrency-related stress/load test... but not a systematic one, because the threads run out of sync and parallel is only interested in all forked threads finishing overall.
Furthermore, no total test diagnostics are computed, so it can be used more for fire-and-forget kinds of testing, IMHO.
This feature would help me out as well. I was looking for something simple to just hammer our Rails apps a bit to test some NGINX configs. JMeter is overkill and Apache Bench is a bit too simple; Hurl seems to be just right.
I spent a bit of time getting a simple test.hurl file working for our API and started to see if I could use it to stress/load test our systems. After reading this thread I'm a bit disappointed that I can't use Hurl for this, and just wanted to post my support for this feature.
Is there a beta branch or something that I could test with?
@rnhurt sorry for this, we know it's a very important feature and we haven't begun working on it (Hurl is unfortunately just a side project). I can assure you that it's among our top priorities and, more importantly, we really want to make it "correct". In a word, we need some time... We are using Hurl in-house, and for concurrency / performance tests we're writing simple shell scripts that benefit from Hurl's assert capabilities. The aim, of course, is to be able to do such tests with Hurl only.
@jcamiel how far along is the design or development of this feature? I could provide a proposed solution as sketched above in https://github.com/Orange-OpenSource/hurl/issues/88#issuecomment-1674518247.
Hi @ppaulweber we have just begun thinking about it. I would prefer that it be tackled by the core maintainers, just because we want to discuss how to do this:
- what we have to change in our current code (and whether we have to do some prework)
- what "public" API we'll expose for the crates
- how we present stdout / stderr
- how we display parallel running tests in --test mode
I can't prevent you from working on it, of course! But this feature will certainly be very structural and we want to be totally sure how to address it in the code.
Instead of coding, maybe we're going to put our thoughts into a working document to discuss how to design the code. We've done this for the Hurl grammar, and this kind of specification / preparation has really helped us write "robust" code. As soon as we've started such a document I'll ping you so you can contribute your input.
Initialisation of an architecture document here => /docs/spec/runner/parallel.md
Just some notes on my attempt at this with Hurl. This is the first project I've used Hurl on for testing, and when performance testing came up I immediately reached for Hurl, as all the use cases for the perf testing were already defined, which is how I ended up in this thread...
I pre-processed a set of tests into one perf.hurl file from the group of tests required, which let me create the load test in the kind of shape that regular request loads look like.
This is basically:

```shell
cat test1.hurl >  perf-single.hurl   # create the file...
cat test2.hurl >> perf-single.hurl   # ...then append (>>, a plain > would overwrite)
cat test3.hurl >> perf-single.hurl
cat test4.hurl >> perf-single.hurl
cat test2.hurl >> perf-single.hurl
cat test3.hurl >> perf-single.hurl
cat test2.hurl >> perf-single.hurl
# Then to elongate the test.
cat perf-single.hurl perf-single.hurl perf-single.hurl perf-single.hurl > perf.hurl
```
The test sequence is the same for each hurl process, but c'est la vie; parallel takes a little while to start each process up, so runs are out of sync immediately.
```shell
dt=$(date -u '+%Y-%m-%dT%H:%M:%SZ')
n=20
outdir="output/run-${dt}-${n}/"
mkdir "${outdir}"
printf 'writing perf results to "%s"\n' "${outdir}"
# GNU parallel ($(seq 1 $n) rather than {1..$n}, since brace expansion does not expand variables)
parallel -j$n hurl --continue-on-error --json --output "${outdir}/thread-{1}.json" perf.hurl ::: $(seq 1 $n)
# or moreutils parallel
parallel -j$n -i hurl --continue-on-error --json --output "${outdir}/thread-{}.json" perf.hurl -- $(seq 1 $n)
```
Then post-process the timings out of the JSON files, which basically amounts to jq filters like .entries[].calls[] | [.request.method, .request.url, .request.queryString, .timings], then grouping by URL and getting min/max/avg/median/percentiles for the timings. I'd prefer to group results by test, but with the load distribution required in perf.hurl I'm not sure how to achieve that. Running multiple .hurl files in one invocation leaves only JSON results for the final file. I could probably extract line numbers from the JSON to match them to test cases.
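For reference, a minimal sketch of that post-processing step, applying the filter above to the per-thread result files written by the run script:

```shell
jq '.entries[].calls[] | [.request.method, .request.url, .request.queryString, .timings]' \
  "${outdir}"/thread-*.json
```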
The JSON output is a bit verbose for a load test; my simple 2000-request perf.hurl run generates a 1.8 MB JSON file per process/thread, but I can't see another option for getting the detailed test timings out of a run.
Nice things to have:
- Less pre-processing. If there is an existing set of Hurl test files, I'd like to run a group of files, repeatedly, in some distribution, across N clients/threads/processes.
- Less post-processing. I'm mainly interested in error occurrences and overall timing min/max/average/percentiles, then the same info grouped per "test". At the moment I am limited to grouping by URL due to the .hurl pre-processing. Possibly adding some extra detail for the erroring or slow tests that occur.
- It's probably hard to meet all stats needs, so maybe just the ability to output a csv/json/whatever of selected results/timings (datetime,thread,test,test_count,url,res,timings...)

Interactive output doesn't really matter much to me; possibly a test count and error count so a test run can be killed when the server blows up, and a quick summary of totals at the end.
@mhio thank you very much for the feedback, it's very interesting and useful. I've started a spec document for the parallel feature here => https://github.com/Orange-OpenSource/hurl/blob/master/docs/spec/runner/parallel.md. For the moment it's just a list of various things to address before the implementation. In the past, we've worked on Hurl with spec documents before implementation and it has helped us a lot.
Hi all,
This is much overdue, but we've implemented a --parallel option to execute Hurl files in parallel, on master. This is an optional flag in the 4.3.0 version (released next week, but already available on master), and it will be officially supported in Hurl 5.0.0 (the version after 4.3.0).
The model used is similar to GNU parallel. To run some tests in parallel, you can just run:

```shell
$ hurl --test --parallel *.hurl
```
The parallelism used is multithreaded and sync: a thread pool is instantiated for the whole run, and each Hurl file is run in its own thread, synchronously. We've not gone the full multithreaded async route for implementation simplicity. Moreover, there is no additional dependency, only the standard Rust lib. @ppaulweber we've chosen not to expose a kind of "thread affinity" inside a Hurl file, once again for simplicity of implementation. The only user option is --jobs, to set the size of the thread pool. By default, the size is roughly the number of available CPUs.
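For example (the --jobs value here is just an illustration):

```shell
# Run all files in parallel on a pool of 4 worker threads.
hurl --test --parallel --jobs 4 *.hurl
```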
Regarding stdout/stderr, we've once again followed the GNU parallel model: standard output and error are buffered during the execution of a file, and only displayed when the file has been executed. As a consequence, the debug logs can be a little delayed, but logs are never intermixed between Hurl files.
One can use debugging for a particular file with an [Options] section and everything should work as intended:
```
GET https://foo.com
[Options]
verbose: true
HTTP 200
```
In test mode, the progress bar is a little different from the non-parallel run; it will be harmonised for the official release (the sequential test progress will look like running hurl --test --parallel --jobs 1).
Regarding reports, the HTML, TAP, and JUnit reports are not affected: reported tests, in parallel or in sequential mode, are in the same order as the execution one. For instance:

```shell
$ hurl --test --report-tap a.hurl b.hurl c.hurl
```

will always produce this TAP report, in this order, no matter which file is executed first:
```
TAP version 13
1..3
ok 1 - a.hurl
ok 2 - b.hurl
ok 3 - c.hurl
```
What's next:
- a lot of tests: we really want to be sure that everything is OK
- maybe some options for the first version: like GNU parallel's --keep-order option, to output standard output in the command-line order of the files. After this first version, we'll add more options of course (for repeating sequences etc.), based on usage and feedback
- add a throttle on terminal display: cargo does this, and we'll add it too, as the refresh rate can be very high for the terminal
- feedback! We'll really be happy to have feedback on the new feature: it's really exciting, Hurl is already fast; with parallel execution it is incredibly fast!
@mhio I'm interested to know if this could potentially replace your usage of GNU parallel!
(same announcement made on #87)
@jcamiel thanks for the update, can't wait in this case for hurl v4.3.0/v5.0.0 :tada:. Maybe we can have an additional discussion on how to introduce formally described load test scenarios inside a hurl specification in the near future.
I'm closing this issue; we'll open new, more specific issues from now on.