artillery
artillery copied to clipboard
Artillery runs forever and never exit (v2)
When the command artillery quick -d 60 -r 256 https://www.company.com/
is executed, artillery will run fine for the first few request and later keep on looping with the following progress report and never exit.
Report for the previous 10s @ 2017-01-11T09:08:56.069Z
Scenarios launched: 0
Scenarios completed: 0
Requests completed: 0
RPS sent: NaN
Request latency:
min: NaN
max: NaN
median: NaN
p95: NaN
p99: NaN
Scenario duration:
min: NaN
max: NaN
median: NaN
p95: NaN
p99: NaN
Following is the environment it runs in:
root@nodejs-512mb-lon1-01:~/artillery# node -v
v6.9.2
root@nodejs-512mb-lon1-01:~/artillery# artillery -V
1.5.0-22
Following is the log file generated by running DEBUG=* artillery quick -d 60 -r 256 https://www.bloomon.nl/ 2>&1 | tee debug.log
:
https://gist.github.com/endyjasmi/a597adc6a0fc5e1c874d7abbe9a93262
Thanks in advance for the helps.
Thanks for the report @endyjasmi. I'm unable to reproduce the issue here. The following command completes fine for me:
artillery quick -d 60 -r 256 http://localhost:8080/
localhost:8080
is an nginx
instance, the tests complete when left to run normally and when I kill nginx
halfway through a run to simulate the target failing under load.
What does the normal Artillery output look like on your system? Do you see reported latency metrics for the full 60 seconds? Are there any errors reported? What does pgrep -lfa node
output once Artillery is printing empty reports?
@hassy From what I understand, this happens when the connection from artillery to the server is high.
@endyjasmi Can you post a log? (Just the normal output from Artillery as it runs the test). Can you also check the CPU usage of node processes and check that there are two of them still running?
I see the same problem.
Report for the previous 10s @ 2017-02-15T16:24:03.192Z
Scenarios launched: 0
Scenarios completed: 0
Requests completed: 0
Concurrent users: 65
RPS sent: NaN
Request latency:
min: NaN
max: NaN
median: NaN
p95: NaN
p99: NaN
Scenario duration:
min: NaN
max: NaN
median: NaN
p95: NaN
p99: NaN
It ran against an environment that never went down.
The load generator machine did experience high cpu usage, but then it went down to normal levels.
@portenez thanks, can you post the full log? Did you ever see metrics being reported while Artillery was running or always just NaN
?
We just figured it out. It was an exception being thrown during custom js code run via beforeRequest
. We fixed the exception, and the test was able to complete.
So, I guess there's still a bug. That an exception in one of the preProcessors will cause the whole test to hang
Hi folks, I'm seeing this too, but when I'm using a prepared Websocket test. It gets to the end of the test, everything's fine, cluster being tested is fine, and then Artillery just keeps doing this forever:
Report for the previous 10s @ 2017-02-16T17:36:42.428Z
Scenarios launched: 0
Scenarios completed: 0
Requests completed: 0
RPS sent: NaN
Request latency:
min: NaN
max: NaN
median: NaN
p95: NaN
p99: NaN
Scenario duration:
min: NaN
max: NaN
median: NaN
p95: NaN
p99: NaN
@portenez thanks, that's something to fix. @igorclark does it happen every time or only sometimes?
A reproducible test case would be amazing, but until then the only option I have is to review the code and try to spot any potential causes.
@hassy don't know about every time, but certainly 9 times out of 10.
Would love to provide a full test case including back-end, but it's not a public project. Here's the majority of the test plan though - don't know if it'll be much use but hope so :-)
config:
target: "ws://<host-name>/"
phases:
-
duration: 60
arrivalRate: 100
ws:
# Ignore SSL certificate errors
# - useful in *development* with self-signed certs
rejectUnauthorized: false
scenarios:
-
engine: "ws"
flow:
-
loop:
-
send: '{<json-message-1>}'
-
send: '{<json-message-2>}'
-
send: '{<json-message-3>}'
-
send: '{<json-message-4>}'
count: 100
@igorclark Thanks for that. Just to confirm - you don't see any error messages printed to the console at any point during a run that ends up spinning forever?
If you don't mind, next time that happens, does the following command print anything:
pgrep -lfa node | grep worker.js
Hey @hassy - no, no Artillery error messages. There are sometimes a couple of "ECONNRESET" or "connection closed by other party" when the server is getting really hammered, but that's normal, no? And when it ends up just spinning, it doesn't have any other errors, just the report every 10s.
I'll do another run tomorrow and run the pgrep
command to get you some output.
same issue
I set a very height arrivalRate and it seems run forever and never stop.
hello.yml
:
config:
target: 'http://www.example.com'
http:
pool: 10
phases:
- duration: 60
arrivalRate: 2000
defaults:
headers:
Connection: close
scenarios:
- flow:
- get:
url: "/"
Ten minutes later, it's still running,
Thanks @tiaod. Seems like high arrival rates have something to do with it but I was unable to reproduce previously. I'll give it another go though.
@tiaod are you wrapping Artillery in another script? Your issue may be related to #264 then.
@hassy Oh I forgot to mention that. You are right, I use child_proccess spanw to run Artillery. But on a lower arrivalRate (eg: 1000), it could stop normally.
Hi @hassy I have the same issue when testing a socket io server with high arrivalRate. From what I see, there are some connections which never get closed. I think when the server unable to handle that much request, artillery will just wait forever. The sever also seems to ignore these connections while artillery just waits and sends nothing. Is there a mechanism to let artillery close these connections itself instead of waiting for response from server? I would also like to consider these scenarios as failure.
The same thing happens if I don't start the server. Artillery will keep waiting.
@peara Thanks for the info. Artillery does try to time out connections - with an explicit timeout setting for HTTP and the default value used by the underlying library for Socket.io - something to look into to try to solve this. Cheers
Unsure if relevant but thought it worth posting.
Recently I came across this issue as well and it seemed to be issues with target and the scenarios url.
example:
target: 'http://myurlexample.com/'
http:
pool: 10
phases:
- duration: 60
arrivalRate: 10
scenarios:
- flow:
- get:
url: "preview/1/"
resulted in: the NaN issue as described by @igorclark
Report @ 11:40:26(+0100) 2017-09-28
Scenarios launched: 100
Scenarios completed: 0
Requests completed: 0
Concurrent users: 289
RPS sent: NaN
Request latency:
min: NaN
max: NaN
median: NaN
p95: NaN
p99: NaN
Scenario duration:
min: NaN
max: NaN
median: NaN
p95: NaN
p99: NaN
but changing the target to myurlexample.com
and scenarios url to /preview/1/
removed the NaN's from the results.
Summary report @ 11:42:41(+0100) 2017-09-28
Scenarios launched: 600
Scenarios completed: 600
Requests completed: 600
Concurrent users: 60
RPS sent: 9.93
Request latency:
min: 95.5
max: 228.7
median: 108.5
p95: 146.8
p99: 182.9
Scenario duration:
min: 105
max: 303
median: 132
p95: 191.5
p99: 239.5
Scenario counts:
0: 600 (100%)
Codes:
200: 600
We suffered this when using an invalid request URL (literal url
). This issue describes the details https://github.com/shoreditch-ops/artillery/issues/512.
This would explain why prefixing the path worked for @cwthompson, as any URL which is not prefixed by /
is treated as an absolute URL.
Hey @hassy 👋
I'm experiencing the same symptoms.
Artillery: 1.6.0-28 Node.js: v12.2.0 OS: darwin/x64
My config looks something like:
config:
target: "https://example.com"
phases:
- duration: 60
arrivalRate: 20
defaults:
headers:
x-api-key: 'XXXXXXX'
scenarios:
flow:
- get:
url: "/example"
qs:
foo: "bar"
The issue is intermittent (occurring ~2/3rds of the time). Below are two runs, one run immediately after the other with no config changes.
Run no. 1 (error)
$ artillery run example.yml
Started phase 0, duration: 60s @ 16:57:05(+0100) 2019-05-20
Report @ 16:57:15(+0100) 2019-05-20
Elapsed time: 10 seconds
Scenarios launched: 199
Scenarios completed: 0
Requests completed: 0
RPS sent: 20.1
Request latency:
min: NaN
max: NaN
median: NaN
p95: NaN
p99: NaN
Run no. 2 (success)
$ artillery run example.yml
Started phase 0, duration: 60s @ 16:57:17(+0100) 2019-05-20
Report @ 16:57:27(+0100) 2019-05-20
Elapsed time: 10 seconds
Scenarios launched: 199
Scenarios completed: 86
Requests completed: 86
RPS sent: 20.12
Request latency:
min: 124.4
max: 202.6
median: 137.4
p95: 182.5
p99: 201.6
Codes:
200: 86
If I get to the bottom of it I'll post here. If you have any ideas about what I might be doing wrong, or if I can provide any more information please let me know!
Thanks
I seem to be seeing the same issue, but have a bit more info that might be useful. First, see this screenshot, showing the issue with the hung test:
The log output on the service has stopped, all the received requests have completed successfully. The test runner itself is in the hung state and sending no more requests (the test should have ended at 1 minute).
The interesting part is that, when I kill the test runner, it seems like one last small batch of requests is fired off just before artillery exits:
Notice the last two requests in the server log, that are received immediately upon killing artillery. You can also see that these requests are then immediately aborted (the finished=false
implying that the connection was closed on these requests before a response was sent, likely because artillery exited).
My guess would be that artillery is hung because it's waiting for these requests, but the requests were never actually sent. And then killing the process causes them to somehow get flushed out and finally sent.
This is so annoying issue :sob:
I have artillery for stress tests and I want to find a limits of server - and artillery found it, but I am not able to get some reasonable output from artillery due this issue.
My setup - I run this 2x parallel in docker containers to avoid problems with CPU overload.
config:
target: "ws://raspberry:3000/ws?templateId=1&username=JakeTheDog"
phases:
- duration: 120
arrivalRate: 30
name: "Stress creating connection"
processor: "./utils/elasticsearch.js"
scenarios:
- engine: "ws"
flow:
- send: "ping pong"
- think: 1
The problem begins when the server begins to have data processing problems and starts to have timeouts and eventually crashes. After that, artillery starts show
Report @ 14:05:46(+0000) 2020-01-17
Elapsed time: 11 minutes, 10 seconds
Scenarios launched: 0
Scenarios completed: 0
Requests completed: 0
RPS sent: NaN
Request latency:
min: NaN
max: NaN
median: NaN
p95: NaN
p99: NaN
Any solution for that? It is so annoying due I send results from artillery to elasticsearch and now, it doesn't send anything.
@hassy - Can you provide some information about this issue? Without it, it doesn't make any sense to use it :man_shrugging:
So please provide some information, if you care about that or this will be still open and we should look for another tool for stress testing.
It's an extremely difficult issue to reproduce (and to create a reproducible test case for). In your case specifically @vojtech-cerveny, there's nothing unusual about the test script itself, so the issue is somewhere else, but there's no way to try to isolate what it might be unless you can provide more information.
I can confirm that the issue is consistently reproducible with high arrival rate.
Artillery: 1.6.0-29
Artillery Pro: not installed (https://artillery.io/pro)
Node.js: v12.14.1
OS: linux/x64
The higher arrival rate, the longer it takes from Artillery to finish and print a report. E.g. with timeout = 30s it succeeds to finish in 30s with arrival rate = 10, but it takes 60s for arrival rate = 500
I'm running it on a pretty beefy machine that has a lot of free CPU, yet I see CPU usage warnings
@whitebyte If Artillery still completes successfully, that's not an instance of the issue. It would make sense that the test takes a longer time to complete if you're creating more virtual users (50 times more if you're going from 10 to 500/second).
CPU warnings-wise - that also still makes sense. Artillery uses one CPU by default, so you're likely maxing a CPU, especially with a high arrival rate (each arrival is a whole new TCP connection by default).
I'm using only one phase with a constant arrival rate. After some arrival rate threshold Artillery starts working indefinitely, eventually spitting NaNs
Turns out that elapsed time is ignored during the run and only # of scenarios launched is taken into account. It looks like Artillery calculates the required number of scenarios to be launched before start, and then just sticks to this number, ignoring the actual time passed. Which is OK, but should be documented somewhere.
Hi, I am running artillery too for load testing.
Artillery: 1.6.1
Node.js: v10.15.3
OS: linux/x64
I am getting the same issue. Did anyone able to resolve this issue? If yes, please share. I am attaching the screens from my trial run. Here's the config I used.
config:
target: "someURL"
ensure:
maxErrorRate: 1
phases:
- duration: 10
arrivalRate: 5
rampTo: 10
name: "Warming up"
- duration: 20
arrivalRate: 10
rampTo: 50
name: "Max load"
scenarios:
- engine: "ws"
flow:
- send: "hello"
- think: 3
- send: "how are you?"