camunda-8-benchmark
PI completed metric no longer updated
Hello,
While trying to use the benchmarking tool, we noticed that the pi_completed metric was never updated. This results in no data being displayed in Grafana for both this metric and pi_cycletime.
Doing some debugging, I found the culprit to be this section of JobWorker.java:
```java
private void registerWorker(String jobType) {
    long fixedBackOffDelay = config.getFixedBackOffDelay();
    JobWorkerBuilderStep1.JobWorkerBuilderStep3 step3 = client.newWorker()
        .jobType(jobType)
        .handler(new SimpleDelayCompletionHandler(false));
    if (fixedBackOffDelay > 0) {
        step3.backoffSupplier(new FixedBackoffSupplier(fixedBackOffDelay));
    }
    step3.open();
}
```
where `new SimpleDelayCompletionHandler(boolean)` is always called with the value `false`, and thus these workers never report any completed PIs:
```java
// worker marking completion of process instance via "task-type-complete"
registerWorker(taskType + "-completed");
// worker marking completion of process instance via "task-type-complete"
registerWorker(taskType + "-" + config.getStarterId() + "-completed");
```
Is this intentional? If so, the Grafana dashboard may need to be updated, as multiple graphs are showing up with no data.
Any luck on this @gabortega ?
Regarding the `SimpleDelayCompletionHandler` constructor in the above code snippet: I did in fact try passing the flag as `true` and running the benchmark tool, but I don't see any change. The Grafana dashboard is still empty, and the completed-jobs and completed-processes counters are still 0.
We are probably missing something very basic. I am surprised that the tool doesn't work out of the box. I have just cloned this repo and modified application.properties to connect to the local Zeebe cluster.
Thanks
Hello @shahamit,
We eventually ended up making a fork of this tool and changed much of the code to fit our own needs.
I don't currently have access to the fix I made before our fork, so this is based on what I remember:
For the original fix, I changed the signature of `registerWorker(String jobType)` to `registerWorker(String jobType, boolean flag)`, passing `true` for those two workers and `false` for all the others.
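Reconstructed from memory, the change looked roughly like this (a sketch, not the exact code from our fork; the parameter name `reportCompletion` is mine):

```java
// Sketch of the signature change: the handler flag is passed through
// instead of being hard-coded to false.
private void registerWorker(String jobType, boolean reportCompletion) {
    long fixedBackOffDelay = config.getFixedBackOffDelay();
    JobWorkerBuilderStep1.JobWorkerBuilderStep3 step3 = client.newWorker()
        .jobType(jobType)
        .handler(new SimpleDelayCompletionHandler(reportCompletion));
    if (fixedBackOffDelay > 0) {
        step3.backoffSupplier(new FixedBackoffSupplier(fixedBackOffDelay));
    }
    step3.open();
}

// Call sites: only the two "-completed" workers report completed PIs.
registerWorker(taskType + "-completed", true);
registerWorker(taskType + "-" + config.getStarterId() + "-completed", true);
// all other workers keep false
```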
We also observed that only the default backpressure strategy would produce the required metrics. Since we wanted to test Zeebe with a fixed throughput while keeping these metrics, we set all modifiers (i.e., `benchmark.startPiReduceFactor` and `benchmark.startPiIncreaseFactor`) to 0, and we did not have to change `benchmark.maxBackpressurePercentage` (I think). We could roughly set our desired throughput using `benchmark.startPiPerSecond`, though we didn't necessarily obtain the exact number set in the property.
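In application.properties terms, that fixed-throughput setup looked roughly like this (the throughput value is illustrative):

```properties
# Fixed-throughput setup: disable the adaptive start-rate modifiers
benchmark.startPiPerSecond=50
benchmark.startPiReduceFactor=0
benchmark.startPiIncreaseFactor=0
# benchmark.maxBackpressurePercentage left at its default
```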
Thanks for your inputs @gabortega. While troubleshooting, if I check out an old revision '08dc3ba3' I do see the completed-jobs counter incremented, but still no data in Grafana. Probably the library upgrades done by the bot broke the application.
Sorry folks - no time to look into this right now - but happy to accept a PR if you find out the root cause. Happy to be pinged again next month; I hope to have more availability then :-|
As a workaround you could look at the Zeebe Grafana Dashboard
@falko - the Zeebe Grafana dashboard has a limitation: it cannot report cycle time for process instances that run for more than 10 seconds. The benchmarking tool's dashboard can report it, but it probably cannot report metrics for k8s deployments.
I am adding this here because the observation fits into the picture. Using the latest image on Kubernetes, I also found that the metrics pi_cycletime and pi_completed were never updated. This didn't change after building the image myself. The following lines were also completely missing from the pod's log:
```
PI STARTED: 1022178 (+ 1680) Last minute rate: 27.8
Backpressure: 171815 (+ 138) Last minute rate: 1.9. Percentage: 6.789 %
PI COMPLETED: 914193 (+ 1150) Last minute rate: 20.0. Mean: 126,707. Percentile .95: 132,827. Percentile .99: 143,085
```
The reason is that the StatisticsCollector (and probably other classes as well) is not properly initialized. During startup, there are a lot of messages like this:
```
17:46:41.441 [main] INFO i.c.z.s.c.a.MicrometerMetricsRecorder - Enabling Micrometer based metrics for spring-zeebe (available via Actuator)
17:46:41.441 [main] INFO o.s.c.s.PostProcessorRegistrationDelegate$BeanPostProcessorChecker - Bean 'micrometerMetricsRecorder' of type [io.camunda.zeebe.spring.client.actuator.MicrometerMetricsRecorder] is not eligible for getting processed by all BeanPostProcessors (for example: not eligible for auto-proxying)
```
The StatisticsCollector was also among them.
I understand that this is related to a cyclic dependency with @Autowired and class initialization during application startup, but for someone who is not into all this Spring and Spring Boot stuff, the interdependencies are completely opaque and I have no clue how to fix it.
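From what I've read, a common general pattern for breaking such a constructor-injection cycle in Spring is @Lazy on one side of the cycle, sketched below with hypothetical bean names. I don't know whether this is the right fix for the benchmark's actual classes:

```java
import org.springframework.context.annotation.Lazy;
import org.springframework.stereotype.Component;

// Hypothetical beans illustrating the pattern; NOT the actual
// camunda-8-benchmark classes.
@Component
class Recorder {
    private final Collector collector;

    // @Lazy injects a proxy, so Collector is only created when first
    // used, which breaks the cycle at startup time.
    Recorder(@Lazy Collector collector) {
        this.collector = collector;
    }
}

@Component
class Collector {
    private final Recorder recorder;

    Collector(Recorder recorder) {
        this.recorder = recorder;
    }
}
```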
I eventually got the benchmark running by checking out a commit from 31 Mar 2022 (before all the Spring-related updates) and building the image from there. I'd really appreciate a fix for the latest version :-)