microcks icon indicating copy to clipboard operation
microcks copied to clipboard

Perf/use batching for DailyStatisticsFeeder

Open Apoorva64 opened this issue 9 months ago • 4 comments

Description

  • The DailyStatisticsFeeder does a lot of requests to the database. This PR aims to batch thoses requests

Related issue(s)

Apoorva64 avatar Jun 07 '25 13:06 Apoorva64

Hi @Apoorva64, and thanks for this contribution!

This is well thought, and seems to be well executed however it breaks the "service interaction check" feature we've built on this mechanism (see https://microcks.io/blog/testcontainers-modules-0.3/#1-interaction-checks).

I think it can be fixed as you have a flushToDatabae() method on the feeder than can be used on purpose when checking interactions.

But, before going further, I'd like to have more context on the reason you worked on this one and the limitations you observed:

  • Did you collect some performance numbers?
  • What is the threshold where no batching starts to make things difficult on your use case?
  • Have you tried setting up the index recommendations on the database? (see https://github.com/microcks/community/tree/main/tuning/mongodb-generic)
  • Did the indexes bring you some performance boosts? If no, why?
  • Have you considered using a Virtual Thread instead of a pure OS thread?

Thanks,

lbroudoux avatar Jun 10 '25 08:06 lbroudoux

--> Did you collect some performance numbers? i've run quick perf tests with

    scenarios: {
      browse: {
        executor: 'constant-vus',
        exec: 'browse',
        vus: 20,
        duration: '1m',
      },
      invokeRESTMocks: {
        executor: 'per-vu-iterations',
        exec: 'invokeRESTMocks',
        vus: 80,
        iterations: 20000,
        startTime: '5s',
        maxDuration: '2m',
      },
      }

with a WAIT_TIME of 0.1

Tests on nightly Build without indexes

K6 Run


          /\      |‾‾| /‾‾/   /‾‾/
     /\  /  \     |  |/  /   /  /
    /  \/    \    |     (   /   ‾‾\
   /          \   |  |\  \ |  (‾)  |
  / __________ \  |__| \__\ \_____/ .io

  execution: local
     script: .\benchmark\bench-microcks.js
     output: Prometheus remote write (http://127.0.0.1:9091/api/v1/write)

  scenarios: (100.00%) 2 scenarios, 100 max VUs, 2m35s max duration (incl. graceful stop):
           * browse: 20 looping VUs for 1m0s (exec: browse, gracefulStop: 30s)
           * invokeRESTMocks: 20000 iterations for each of 80 VUs (maxDuration: 2m0s, exec: invokeRESTMocks, startTime: 5s, gracefulStop: 30s)


     ✓ status code should be 200

     â–ˆ REST API Tests

       ✓ pastryCall status is 200
       ✓ eclairCall status is 200
       ✓ eclairXmlCall status is 200
       ✓ eclairXmlCall response is XML
       ✓ millefeuilleCall status is 200

     checks.........................: 100.00% ✓ 245779       ✗ 0
     data_received..................: 144 MB  1.1 MB/s
     data_sent......................: 26 MB   207 kB/s
     dropped_iterations.............: 1551297 12393.702821/s
     group_duration.................: avg=197.22ms min=125.29ms med=191.56ms max=462.81ms p(90)=229.97ms p(95)=249.04ms
     http_req_blocked...............: avg=16.05µs  min=0s       med=0s       max=9.11ms   p(90)=0s       p(95)=0s
     http_req_connecting............: avg=9.85µs   min=0s       med=0s       max=7.74ms   p(90)=0s       p(95)=0s
     http_req_duration..............: avg=50ms     min=2.39ms   med=47.6ms   max=367.4ms  p(90)=61.25ms  p(95)=68.54ms
       { expected_response:true }...: avg=50ms     min=2.39ms   med=47.6ms   max=367.4ms  p(90)=61.25ms  p(95)=68.54ms
     http_req_failed................: 0.00%   ✓ 0            ✗ 201604
     http_req_receiving.............: avg=105.16µs min=0s       med=0s       max=121.6ms  p(90)=507.9µs  p(95)=616µs
     http_req_sending...............: avg=21.18µs  min=0s       med=0s       max=20.07ms  p(90)=0s       p(95)=0s
     http_req_tls_handshaking.......: avg=0s       min=0s       med=0s       max=0s       p(90)=0s       p(95)=0s
     http_req_waiting...............: avg=49.88ms  min=1.78ms   med=47.48ms  max=367.4ms  p(90)=61.11ms  p(95)=68.41ms
     http_reqs......................: 201604  1610.665181/s
     iteration_duration.............: avg=212.17ms min=125.29ms med=192.79ms max=808.52ms p(90)=244.49ms p(95)=313.14ms
     iterations.....................: 50967   407.188212/s
     vus............................: 80      min=20         max=100
     vus_max........................: 100     min=100        max=100


FlameGraph

image In the FlameGraph we can see the methods executeFindMultiInternal and executeFindOneInternal(mongodb calls) which accounts around 20% of the total time

Tests on nightly Build with indexes


          /\      |‾‾| /‾‾/   /‾‾/
     /\  /  \     |  |/  /   /  /
    /  \/    \    |     (   /   ‾‾\
   /          \   |  |\  \ |  (‾)  |
  / __________ \  |__| \__\ \_____/ .io

  execution: local
     script: .\benchmark\bench-microcks.js
     output: Prometheus remote write (http://127.0.0.1:9091/api/v1/write)

  scenarios: (100.00%) 2 scenarios, 100 max VUs, 2m35s max duration (incl. graceful stop):
           * browse: 20 looping VUs for 1m0s (exec: browse, gracefulStop: 30s)
           * invokeRESTMocks: 20000 iterations for each of 80 VUs (maxDuration: 2m0s, exec: invokeRESTMocks, startTime: 5s, gracefulStop: 30s)


     ✓ status code should be 200

     â–ˆ REST API Tests

       ✓ pastryCall status is 200
       ✓ eclairCall status is 200
       ✓ eclairXmlCall status is 200
       ✓ eclairXmlCall response is XML
       ✓ millefeuilleCall status is 200

     checks.........................: 100.00% ✓ 179894      ✗ 0
     data_received..................: 109 MB  868 kB/s
     data_sent......................: 19 MB   152 kB/s
     dropped_iterations.............: 1564446 12497.5697/s
     group_duration.................: avg=270.14ms min=124.49ms med=270.46ms max=482.5ms  p(90)=317.89ms p(95)=332.49ms
     http_req_blocked...............: avg=16.26µs  min=0s       med=0s       max=10.66ms  p(90)=0s       p(95)=0s
     http_req_connecting............: avg=10.31µs  min=0s       med=0s       max=10.66ms  p(90)=0s       p(95)=0s
     http_req_duration..............: avg=68.17ms  min=2.88ms   med=67.67ms  max=386.14ms p(90)=94.88ms  p(95)=103.82ms
       { expected_response:true }...: avg=68.17ms  min=2.88ms   med=67.67ms  max=386.14ms p(90)=94.88ms  p(95)=103.82ms
     http_req_failed................: 0.00%   ✓ 0           ✗ 148588
     http_req_receiving.............: avg=102.88µs min=0s       med=0s       max=22.05ms  p(90)=507.6µs  p(95)=546.29µs
     http_req_sending...............: avg=22.1µs   min=0s       med=0s       max=13.06ms  p(90)=0s       p(95)=0s
     http_req_tls_handshaking.......: avg=0s       min=0s       med=0s       max=0s       p(90)=0s       p(95)=0s
     http_req_waiting...............: avg=68.04ms  min=2.49ms   med=67.54ms  max=385.63ms p(90)=94.75ms  p(95)=103.69ms
     http_reqs......................: 148588  1186.994557/s
     iteration_duration.............: avg=286.95ms min=124.49ms med=273.3ms  max=774.09ms p(90)=334.1ms  p(95)=464.57ms
     iterations.....................: 37678   300.99053/s
     vus............................: 80      min=20        max=100
     vus_max........................: 100     min=100       max=100

                                                                                                                                                                                                                                                                                                                                                               
running (2m05.2s), 000/100 VUs, 37678 complete and 0 interrupted iterations                                                                                                                                                                                                                                                                                    
browse          ✓ [======================================] 20 VUs  1m0s                                                                                                                                                                                                                                                                                        
invokeRESTMocks ✓ [======================================] 80 VUs  2m0s  0035554/1600000 iters, 20000 per VU  

FlameGraph

image MongoDb indexes don't seem to impact performance much

Tests on use-batching-for-metrics without indexes


          /\      |‾‾| /‾‾/   /‾‾/
     /\  /  \     |  |/  /   /  /
    /  \/    \    |     (   /   ‾‾\
   /          \   |  |\  \ |  (‾)  |
  / __________ \  |__| \__\ \_____/ .io

  execution: local
     script: .\benchmark\bench-microcks.js
     output: Prometheus remote write (http://127.0.0.1:9091/api/v1/write)

  scenarios: (100.00%) 2 scenarios, 100 max VUs, 2m35s max duration (incl. graceful stop):
           * browse: 20 looping VUs for 1m0s (exec: browse, gracefulStop: 30s)
           * invokeRESTMocks: 20000 iterations for each of 80 VUs (maxDuration: 2m0s, exec: invokeRESTMocks, startTime: 5s, gracefulStop: 30s)


     ✓ status code should be 200

     â–ˆ REST API Tests

       ✓ pastryCall status is 200
       ✓ eclairCall status is 200
       ✓ eclairXmlCall status is 200
       ✓ eclairXmlCall response is XML
       ✓ millefeuilleCall status is 200

     checks.........................: 100.00% ✓ 664408       ✗ 0
     data_received..................: 367 MB  2.9 MB/s
     data_sent......................: 69 MB   554 kB/s
     dropped_iterations.............: 1467733 11737.233652/s
     group_duration.................: avg=72.5ms   min=39.63ms med=70.83ms max=183.26ms p(90)=86.61ms p(95)=93.09ms
     http_req_blocked...............: avg=24.52µs  min=0s      med=0s      max=26.52ms  p(90)=0s      p(95)=0s
     http_req_connecting............: avg=16.37µs  min=0s      med=0s      max=26.52ms  p(90)=0s      p(95)=0s
     http_req_duration..............: avg=18.03ms  min=2.16ms  med=17.15ms max=371.76ms p(90)=23.78ms p(95)=26.67ms
       { expected_response:true }...: avg=18.03ms  min=2.16ms  med=17.15ms max=371.76ms p(90)=23.78ms p(95)=26.67ms
     http_req_failed................: 0.00%   ✓ 0            ✗ 538287
     http_req_receiving.............: avg=145.28µs min=0s      med=0s      max=42.38ms  p(90)=511.9µs p(95)=1ms
     http_req_sending...............: avg=32.73µs  min=0s      med=0s      max=28.79ms  p(90)=0s      p(95)=0s
     http_req_tls_handshaking.......: avg=0s       min=0s      med=0s      max=0s       p(90)=0s      p(95)=0s
     http_req_waiting...............: avg=17.85ms  min=2.04ms  med=16.97ms max=371.76ms p(90)=23.57ms p(95)=26.43ms
     http_reqs......................: 538287  4304.597833/s
     iteration_duration.............: avg=79.8ms   min=39.63ms med=71.16ms max=817.16ms p(90)=88.88ms p(95)=99.01ms
     iterations.....................: 135340  1082.293035/s
     vus............................: 80      min=20         max=100
     vus_max........................: 100     min=100        max=100

                                                                                                                                                                                                                                                                                                                                                               
running (2m05.0s), 000/100 VUs, 135340 complete and 0 interrupted iterations                                                                                                                                                                                                                                                                                   
browse          ✓ [======================================] 20 VUs  1m0s                                                                                                                                                                                                                                                                                        
invokeRESTMocks ✓ [======================================] 80 VUs  2m0s  0132267/1600000 iters, 20000 per VU                                                                                                                                                                                                                                                

FlameGraph

image

In the FlameGraph we can see the methods executeFindMultiInternal and executeFindOneInternal(mongodb calls) are gone

i'll answer the rest of the questions and provide a more detailled explaination with a reproductable environment during the following week

Apoorva64 avatar Jun 10 '25 21:06 Apoorva64

MongoDb indexes don't seem to impact performance much

Even seems to be worse 🤣 What MongoDB are you using? The one we provide? Executed in docker compose? A remote one? Have you checked indexes on daily statistics collection have been correctly created?

I would also be interested in how you first discovered this problem. What was the use-case you were trying to address? How many services/hits/users make you trigger this exploration and discover this optimisation?

lbroudoux avatar Jun 12 '25 07:06 lbroudoux

I'm using the Mongodb in the compose. I'm currently building a setup with grafana and some dashboard to show the problems. Yes the statistics are created correctly (the tests on it are green)

I didn't go from a specific use-case for this optimisation. I just saw that microcks was too slow and cloud clearly be faster with some little changes xD

Apoorva64 avatar Jun 14 '25 08:06 Apoorva64