add the --concurrency option to launch multiple runs of *.hurl files instead of one
Hi guys 😄
In the context of our project, we see two huge advantages to using Hurl:
- It is an ideal solution for functionally testing all the page sequences, what we call customer paths, on our web application. This allows us to guarantee non-regression between versions.
- The simplicity of its language allows non-developers to generate a multitude of scenarios easily.
However, we cannot "yet" use Hurl for our metrology tests, forcing us to duplicate all our scenarios in other tools that are less accessible to non-developers... 😢
Currently, when I launch this command:

```shell
hurl a.hurl b.hurl c.hurl
```

hurl runs each *.hurl file one time: first a.hurl, then b.hurl, then c.hurl.
```
─ BEGIN
├─ a.hurl
├─ b.hurl
├─ c.hurl
─ END
```
I propose adding a --concurrency option allowing hurl to launch multiple runs of each *.hurl file at a time, for example:

```shell
hurl --concurrency 3 a.hurl b.hurl c.hurl
```
```
─ BEGIN
├─ a.hurl ├─ a.hurl ├─ a.hurl
├─ b.hurl ├─ b.hurl ├─ b.hurl
├─ c.hurl ├─ c.hurl ├─ c.hurl
─ END
```
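For comparison, a rough shell equivalent of what --concurrency 3 would do (a sketch of the proposed semantics, not an existing option):

```shell
# Launch 3 concurrent runs; each run executes a.hurl, b.hurl, c.hurl in sequence.
for i in 1 2 3; do
  hurl a.hurl b.hurl c.hurl &
done
wait  # join the 3 runs before exiting
```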
What do you think about it?
This one is not that easy, there are many ways to do it! One of the main objectives of Hurl is to stay as simple as possible, which might not be compatible with performance testing.
Maybe we could distribute another binary that would also process Hurl files but be dedicated to performance testing. That's what we have done with the hurlfmt binary, which only deals with linting and formatting.
Completely agree with your proposal.
@lepapareil I agree with @fabricereix, there are for sure several ways to implement such a feature.
Nevertheless, I would also like to see a "concurrent" and "bulk" (related to #1139) feature in hurl, but I would suggest providing it at the language level.
Since the hurl request/response file description already comes very close to test/benchmark plans, a possible addition would be to specify optional THREADS and REPEATS values at the top level of the specification, as well as an optional setting at the request/response level.
The following hurl specification would use the two new language keywords, expressing the same semantics as if they were not present at all:

```
THREADS 1
REPEATS 1

POST {{LOCATION}}/endpoint
{
  "key": "value"
}

HTTP 200
# THREADS 1
# REPEATS 1
```
Using a generic mechanism like this would allow you to write almost all the combinations necessary for proper test and benchmark plans on a per-hurl-file basis.
Introducing this feature first would then allow, as an additional step, "nesting" of sub-hurl specifications, e.g. using an INCLUDE <file> statement (without the .hurl extension) referencing other hurl specifications, to formulate @lepapareil's example like this:

```
INCLUDE a
REPEATS 3
INCLUDE b
REPEATS 3
INCLUDE c
REPEATS 3
```
Since this is by nature sequential execution, only the REPEATS keyword would be necessary for this use case.
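For comparison, a rough shell equivalent of this sequential INCLUDE/REPEATS sketch with today's CLI (file names as in the example above):

```shell
# Run each file 3 times, strictly in order: a, a, a, b, b, b, c, c, c.
for f in a.hurl b.hurl c.hurl; do
  for i in 1 2 3; do
    hurl "$f"
  done
done
```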
For the use case of #1139 from @hemedani, it depends on what test behavior is desired, but even there the "bulk" testing can then be achieved in a parallel or sequential fashion, e.g.:
1000 runs in sequence of the whole hurl test:

```
REPEATS 1000
GET http://google.com
```

1000 runs in parallel of the whole hurl test:

```
THREADS 1000
GET http://google.com
```

1000 runs total, 100 in sequence times 10 in parallel, of the whole hurl test:

```
THREADS 10
REPEATS 100
GET http://google.com
```
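A rough shell equivalent of the last combination, for illustration (the THREADS/REPEATS keywords are hypothetical; test.hurl is a placeholder file):

```shell
# 10 parallel workers, each running the file 100 times in sequence: 1000 runs total.
for t in $(seq 10); do
  ( for r in $(seq 100); do hurl test.hurl; done ) &
done
wait
```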
Similar to the delay option, we could also add a repeat option:

```
GET http://google.com
[Options]
repeat: 1000
```
@fabricereix sounds good to me for the repetition use case.
What about the "load testing" use case for parallel request scenarios? There, an additional thread: <concurrency-number> option would be suitable as well, like:

```
GET http://google.com
[Options]
repeat: 3
thread: 2
```
Like I stated above, it would be nice to also be able to specify the repeat and thread options (in this case) at the top level of the hurl specification, to express a test/benchmark scenario like:

```
[Options]
repeat: 2
thread: 2

POST https://example.org
[Options]
repeat: 3

GET https://example.org
[Options]
thread: 3
```
The whole specification is repeated twice, and each run executes in 2 parallel threads (not synchronized, except at the end). The first request executes its repetitions purely sequentially, whereas the second request of a test run is executed three times in parallel and synchronized at the end of the response (bounded parallelism).
A visual of the example test specification would be:
```mermaid
flowchart LR
  s0(hurl) -->|run| s1{{fork}}
  s1 --> s2
  s1 --> s3
  s6{{join}}
  s7{{end}}
  s2 --> s4
  s3 --> s5
  s4 --> s6
  s5 --> s6
  s6 --> s7
  subgraph run
    direction LR
    subgraph s2[1st thread, 1st repetition]
      direction LR
      subgraph s20[repeat 3]
        direction LR
        s200[POST] --> s201
        s201[POST] --> s202
        s202[POST]
      end
      subgraph s21[thread 3]
        direction LR
        s210[GET]
        s211[GET]
        s212[GET]
        s213{{fork}}
        s214{{join}}
        s213 --> s210
        s213 --> s211
        s213 --> s212
        s210 --> s214
        s211 --> s214
        s212 --> s214
      end
      s20 --> s21
    end
    subgraph s3[2nd thread, 1st repetition]
      s30[same as 1st thread, 1st repetition]
    end
    subgraph s4[1st thread, 2nd repetition]
      s40[same as 1st thread, 1st repetition]
    end
    subgraph s5[2nd thread, 2nd repetition]
      s50[same as 1st thread, 1st repetition]
    end
  end
```
For the concurrency use case, it would be interesting to try it first with an external script using GNU parallel. It would help to find good semantics.
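A minimal sketch of that approach, assuming GNU parallel is installed (file names are just examples): each file is run 3 times, with at most 3 jobs at a time.

```shell
# {1} picks the file from the first input source; the second source (1 2 3)
# only multiplies the combinations, so each file is executed 3 times.
parallel -j 3 hurl {1} ::: a.hurl b.hurl c.hurl ::: 1 2 3
```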
I agree for a global concurrency-related stress/load test... but not a systematic one, because the threads run out of sync and parallel is only interested in all forked threads finishing overall.
Furthermore, no total test diagnostics are computed, so it can be used more for fire-and-forget kinds of testing, IMHO.
This feature would help me out as well. I was looking for something simple to just hammer our Rails apps a bit to test some NGINX configs. JMeter is overkill and Apache Bench is a bit too simple; Hurl seems to be just right.
I spent a bit of time getting a simple test.hurl file working for our API and started to see if I could use it to stress/load test our systems. After reading this thread I'm a bit disappointed that I can't use Hurl for this, and just wanted to post my support for this feature.
Is there a beta branch or something that I could test with?
@rnhurt sorry for this, we know it's a very important feature and we haven't begun working on it (Hurl is unfortunately just a side project). I can assure you that it's among our top priorities and, more importantly, we really want to make it "correct". In a word, we need some time... We are using Hurl in-house, and for concurrency / performance tests we're writing simple shell scripts that benefit from Hurl's assert capabilities. The aim, of course, is to be able to do such tests with Hurl only.
@jcamiel how far along is the design or development of this feature? I could provide a proposed solution as sketched above in https://github.com/Orange-OpenSource/hurl/issues/88#issuecomment-1674518247.
Hi @ppaulweber we have just begun thinking about it. I would prefer that it be tackled by the core maintainers, just because we want to discuss how to do this:
- what we have to change in our current code (and whether we have to do some prework)
- what "public" API we'll expose for the crates
- how we present stdout / stderr
- how we display parallel running tests in --test mode
I can't prevent you from working on it, of course! But this feature will certainly be very structural and we want to be totally sure how to address it in the code.
Instead of coding, maybe we're going to put our thoughts into a working document to discuss how to design the code. We've done this for the Hurl grammar, and this kind of specification / preparation has really helped us write "robust" code. As soon as we've started such a document I'll ping you so you can contribute your input.
Initialisation of an architecture document here => /docs/spec/runner/parallel.md
Just some notes on my attempt at this with Hurl. This is the first project I've used Hurl on for testing, and when performance testing came up I immediately reached for Hurl, as all the use cases for the perf testing were already defined, which is how I ended up in this thread...
I pre-processed a set of tests into one perf.hurl file from the group of tests required, which let me create the load test in the kind of shape that regular request loads look like.
This is basically:

```shell
cat test1.hurl >  perf-single.hurl   # create the file...
cat test2.hurl >> perf-single.hurl   # ...then append (>>, a plain > would overwrite)
cat test3.hurl >> perf-single.hurl
cat test4.hurl >> perf-single.hurl
cat test2.hurl >> perf-single.hurl
cat test3.hurl >> perf-single.hurl
cat test2.hurl >> perf-single.hurl
# Then to elongate the test.
cat perf-single.hurl perf-single.hurl perf-single.hurl perf-single.hurl > perf.hurl
```
The test sequence is the same for each hurl process, but c'est la vie; parallel takes a little while to start each process up, so runs are out of sync immediately.
```shell
dt=$(date -u '+%Y-%m-%dT%H:%M:%SZ')
n=20
outdir="output/run-${dt}-${n}/"
mkdir "${outdir}"
printf 'writing perf results to "%s"\n' "${outdir}"
# GNU parallel ($(seq 1 $n) rather than {1..$n}, since brace expansion does not expand variables)
parallel -j$n hurl --continue-on-error --json --output "${outdir}/thread-{1}.json" perf.hurl ::: $(seq 1 $n)
# or moreutils parallel
parallel -j$n -i hurl --continue-on-error --json --output "${outdir}/thread-{}.json" perf.hurl -- $(seq 1 $n)
```
Then post-process the timings out of the JSON files, which basically amounts to jq filters like .entries[].calls[] | [.request.method, .request.url, .request.queryString, .timings], then grouping by URL and getting min/max/avg/median/percentiles for the timings. I'd prefer to group results by test, but with the load distribution required in perf.hurl I'm not sure how to achieve that. Running multiple .hurl files in one invocation leaves only JSON results for the final file. I could probably extract line numbers from the JSON to match them to test cases.
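For reference, a minimal sketch of that post-processing step, applying the filter above to the per-thread result files written by the run script:

```shell
jq '.entries[].calls[] | [.request.method, .request.url, .request.queryString, .timings]' \
  "${outdir}"/thread-*.json
```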
The JSON output is a bit verbose for a load test; my simple 2000-request perf.hurl run generates a 1.8 MB JSON file per process/thread, but I can't see another option for getting the detailed test timings out of a run.
Nice things to have:
- Less pre-processing. If there is an existing set of Hurl test files, I'd like to run a group of files, repeatedly, in some distribution, across N clients/threads/processes.
- Less post-processing. I'm mainly interested in error occurrences and overall timing min/max/average/percentiles, then the same info grouped per "test". At the moment I am limited to grouping by URL due to the .hurl pre-processing. Possibly adding some extra detail for the erroring or slow tests that occur.
- It's probably hard to meet all stats needs, so maybe just the ability to output a csv/json/whatever of selected results/timings (datetime,thread,test,test_count,url,res,timings...)

Interactive output doesn't really matter much to me; possibly a test count and error count so a test run can be killed when the server blows up, and a quick summary of totals at the end.
@mhio thank you very much for the feedback, it's very interesting and useful. I've started a spec document for the parallel feature here => https://github.com/Orange-OpenSource/hurl/blob/master/docs/spec/runner/parallel.md. For the moment it's just a list of various things to address before the implementation. In the past, we've worked on Hurl with spec documents before implementation and it has helped us a lot.
Hi all,
This is much overdue, but we've implemented a --parallel option to execute Hurl files in parallel, on master. This is an optional flag in the 4.3.0 version (released next week, but already available on master), and it will be officially supported in Hurl 5.0.0 (the version after 4.3.0).
The model used is similar to GNU parallel. To run some tests in parallel, you can just run:

```shell
$ hurl --test --parallel *.hurl
```
The parallelism used is multithreaded and sync: a thread pool is instantiated for the whole run, and each Hurl file is run in its own thread, synchronously. We've not gone the full multithreaded async route for implementation simplicity. Moreover, there is no additional dependency, only the standard Rust lib. @ppaulweber we've chosen not to expose a kind of "thread affinity" inside a Hurl file, once again for simplicity of implementation. The only user option is --jobs, to set the size of the thread pool. By default, the size is roughly the number of available CPUs.
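For example (the --jobs value here is just an illustration):

```shell
# Run all files in parallel on a pool of 4 worker threads.
hurl --test --parallel --jobs 4 *.hurl
```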
Regarding stdout/stderr, we've once again followed the GNU parallel model: standard output and error are buffered during the execution of a file, and only displayed when the file has been executed. As a consequence, the debug logs can be a little delayed, but logs are never intermixed between Hurl files.
One can use debugging for a particular file with an [Options] section and everything should work as intended:
```
GET https://foo.com
[Options]
verbose: true
HTTP 200
```
In test mode, the progress bar is a little different from the non-parallel run; it will be harmonised for the official release (the sequential test progress will look like running hurl --test --parallel --jobs 1).
Regarding reports, the HTML, TAP, and JUnit reports are not affected: reported tests, in parallel or in sequential mode, are in the same order as the execution one. For instance:

```shell
$ hurl --test --report-tap a.hurl b.hurl c.hurl
```

will always produce this TAP report, in this order, no matter which file is executed first:
```
TAP version 13
1..3
ok 1 - a.hurl
ok 2 - b.hurl
ok 3 - c.hurl
```
What's next:
- a lot of tests: we really want to be sure that everything is OK
- maybe some options for the first version: like GNU parallel's --keep-order option, to output standard output in the command-line order of the files. After this first version, we'll add more options of course (for repeating sequences etc.), based on usage and feedback
- add a throttle on terminal display: cargo does this, and we'll add it too, as the refresh rate can be very high for the terminal
- feedback! We'll really be happy to have feedback on the new feature: it's really exciting, Hurl is already fast; with parallel execution it is incredibly fast!
@mhio I'm interested to know if this could potentially replace your usage of GNU parallel!
(same announcement made on #87)
@jcamiel thanks for the update, can't wait in this case for hurl v4.3.0/v5.0.0 :tada:. Maybe we can have an additional discussion on how to introduce formally described load test scenarios inside a hurl specification in the near future.
I'm closing this issue; we'll open new, more specific issues from now on.