locust icon indicating copy to clipboard operation
locust copied to clipboard

Discarded report from unrecognized worker

Open michie1 opened this issue 1 year ago • 11 comments

Prerequisites

Description

This is a follow up of https://github.com/locustio/locust/issues/3000. The UI now shows the correct number, but only when one worker node is used. When adding --processes 10 and specifying 10 users in the UI I still get a lower number in the UI compared with what the nodejs command is showing.

The logs is saying "locust.runners: ... reported that is has stopped" and later "Discarded report from unrecognized worker ...". This is logged multiple times for multiples runners.

const http = require('http');

let requestCount = 1;

const server = http.createServer((req, res) => {
  setTimeout(() => {
    console.log(requestCount);
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end(`Request count: ${requestCount}`);
    requestCount++; // Increment the request counter
  }, 1000);
});

const PORT = 3000;
server.listen(PORT, () => {
    console.log(`Server running at http://localhost:${PORT}`);
});

Command line

locust -s 10 --processes 10

Locustfile contents

from locust import HttpUser, task

i = 0

class HelloWorldUser(HttpUser):
    @task
    def hello_world(self):
        global i
        print("before", i)
        self.client.get("/")
        print("after", i)
        i += 1


### Python version

3.10

### Locust version

2.32.4

### Operating system

Ubuntu 22.04.5 WSL

michie1 avatar Dec 02 '24 07:12 michie1

image

michie1 avatar Dec 02 '24 07:12 michie1

Same holds when I remove the setTimeout (I thought maybe that would introduce count differences) and specify the Runtime to 10 seconds instead of clicking on the "Stop" button.

(--headless returns the correct count.)

michie1 avatar Dec 02 '24 08:12 michie1

I can reproduce this as well. I wonder when it broke... Anyways, I'm on it...

cyberw avatar Dec 04 '24 09:12 cyberw

Actually, I'm only able to reproduce it with --stop-timeout. I still sometimes get the log message at shutdown without -s, but then no requests are actually missing (so probablly going to drop that log message level to debug). So still a bug, but much smaller scope...

cyberw avatar Dec 04 '24 10:12 cyberw

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 20 days.

github-actions[bot] avatar Feb 03 '25 02:02 github-actions[bot]

Please remove stale label.

michie1 avatar Feb 06 '25 19:02 michie1

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 20 days.

github-actions[bot] avatar Apr 09 '25 02:04 github-actions[bot]

Any updates on this - annoying issue? Thanks! It looks like because the report gets discarded then the master disconnects and the worker disconnects?

cogito-locust-master-7c7ccd9686-72gjk cogito-locust [2025-04-28 15:28:33,278] cogito-locust-master-7c7ccd9686-72gjk/INFO/locust.runners: Discarded report from unrecognized worker cogito-locust-worker-7df8cb89cf-kgshz_2fd8598aca3c4042891a5a3382cd202e
cogito-locust-master-7c7ccd9686-72gjk cogito-locust [2025-04-28 15:28:36,280] cogito-locust-master-7c7ccd9686-72gjk/INFO/locust.runners: Discarded report from unrecognized worker cogito-locust-worker-7df8cb89cf-kgshz_2fd8598aca3c4042891a5a3382cd202e
cogito-locust-worker-7df8cb89cf-kgshz cogito-locust [2025-04-28 15:28:37,314] cogito-locust-worker-7df8cb89cf-kgshz/ERROR/locust.runners: Didn't get heartbeat from master in over 60s
cogito-locust-worker-7df8cb89cf-kgshz cogito-locust [2025-04-28 15:28:37,314] cogito-locust-worker-7df8cb89cf-kgshz/INFO/locust.main: Shutting down (exit code 0)
- cogito-locust-worker-7df8cb89cf-kgshz › cogito-locust
cogito-locust-master-7c7ccd9686-72gjk cogito-locust [2025-04-28 15:28:50,978] cogito-locust-master-7c7ccd9686-72gjk/INFO/locust.runners: cogito-locust-worker-7df8cb89cf-kgshz_4168cabe88344bdd861b45900e49f96b (index 1) reported as ready. 1 workers connected.

Some more logs

cogito-locust-master-7c7ccd9686-566sw cogito-locust [2025-04-28 15:52:00,092] cogito-locust-master-7c7ccd9686-566sw/INFO/locust.runners: cogito-locust-worker-7df8cb89cf-blmgw_d93f9801ad8944e794bfbda2e16e8bd8 (index 0) reported that it has stopped
cogito-locust-master-7c7ccd9686-566sw cogito-locust [2025-04-28 15:52:01,320] cogito-locust-master-7c7ccd9686-566sw/INFO/locust.runners: Discarded report from unrecognized worker cogito-locust-worker-7df8cb89cf-blmgw_d93f9801ad8944e794bfbda2e16e8bd8
cogito-locust-master-7c7ccd9686-566sw cogito-locust [2025-04-28 15:52:04,322] cogito-locust-master-7c7ccd9686-566sw/INFO/locust.runners: Discarded report from unrecognized worker cogito-locust-worker-7df8cb89cf-blmgw_d93f9801ad8944e794bfbda2e16e8bd8

Basically I hit stop on the UI - I would not expect that to result in a worker restart?

frankjkelly avatar Apr 28 '25 15:04 frankjkelly

Can you reproduce this consistently with nothing in special in your locustfile? Are you using --stop-timeout? Are you using docker?

Worker restart on stop test is weird and something we wanna fix, but it probably unrelated.

cyberw avatar Apr 28 '25 19:04 cyberw

@cyberw Yes - see this consistently. I guess one thing that is special is we are using the SocketIOUser https://github.com/SvenskaSpel/locust-plugins/blob/master/locust_plugins/users/socketio.py (A slightly modified one but nothing magical to my newbie eye at least)

We are deployed on Kubernetes with a Master and a Worker.

Dockerfile below

# Example from https://github.com/bneijt/poetry-lock-docker/
FROM python:3.10.16-slim

WORKDIR /project

COPY dist/*.whl /project/

ARG PIP_INDEX_URL

# Needed for fastavro
RUN apt-get update 
RUN apt-get -y install gcc

RUN pip install --no-cache-dir *.whl  \
    && rm -rf *.whl

# Temp fix for CVE-2021-46848
RUN apt-get --only-upgrade install -y libexpat1 libtasn1-6

USER 1001
COPY locust/tasks/ /locust-tasks/

#From https://github.com/locustio/locust/blob/master/Dockerfile
EXPOSE 8089 5557
# turn off python output buffering
ENV PYTHONUNBUFFERED=1
ENTRYPOINT ["locust"]

Using Flux we update the command to be

        - op: replace
          path: "/spec/values/master/deployment/command"
          value: ["locust", "-f", "/locust-tasks/locustFile.py", "--master", "--class-picker", "--web-port", '"8089"']

Worker deploys with this

command: ["locust", "-f", "/locust-tasks/locustFile.py", "--worker", "--master-host=cogito-locust-master"]

frankjkelly avatar Apr 28 '25 19:04 frankjkelly

Have you checked cpu usage and ensured pods arent restarting or anything? Can you reproduce at low load (like a single User)

cyberw avatar Apr 28 '25 21:04 cyberw

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 20 days.

github-actions[bot] avatar Jun 28 '25 02:06 github-actions[bot]

This issue was closed because it has been marked stale for 20 days with no activity. This does not necessarily mean that the issue is bad, but it most likely means that nobody is willing to take the time to fix it. If you have found Locust useful, then consider contributing a fix yourself!

github-actions[bot] avatar Jul 18 '25 02:07 github-actions[bot]