
Writes failing under high load with Error: 8 RESOURCE_EXHAUSTED: Bandwidth exhausted and Error: 13 INTERNAL: Received RST_STREAM with code 2 (Internal server error)

Open Lougarou opened this issue 3 years ago • 5 comments

Reproduction steps:

  1. Set up the test client: https://github.com/EventStore/test-client-node
  2. Set up an EventStore 20.10.2 cluster with 2 vCPUs and 4 GB of memory, similar to a C4 on EventStore Cloud.
  3. Run the following command: yarn start wrfl --connection-string="esdb+discover://admin:changeit@uri:2113" --client_count=10 --request_count=30000 --stream_count=5000 --size=10 --worker_count=1

Note: this has been tested on a t2.micro instance on AWS.

WORKER 0 TOTALS:  30000 WRITES IN 31262.554757ms (959.6144727513895/s)
       CLIENT 0:  3000 WRITES IN 22811.676039ms (131.51159936126777/s)
       CLIENT 1:  3000 WRITES IN 21562.335408ms (139.13149680840917/s)
       CLIENT 2:  3000 WRITES IN 24160.856734ms (124.16778233605828/s)
       CLIENT 3:  3000 WRITES IN 24975.396275ms (120.1182142204068/s)
       CLIENT 4:  3000 WRITES IN 25664.61686ms (116.89245221796777/s)
       CLIENT 5:  3000 WRITES IN 26430.558514ms (113.50497941278579/s)
       CLIENT 6:  3000 WRITES IN 27084.961592ms (110.76257168797687/s)
       CLIENT 7:  3000 WRITES IN 28198.875177ms (106.38722222675413/s)
       CLIENT 8:  3000 WRITES IN 29487.788001ms (101.73703093288188/s)
       CLIENT 9:  3000 WRITES IN 30235.139748ms (99.22229647370638/s)
DONE TOTAL 30000 WRITES IN 31597.491925ms (949.4424453437059/s)
SUCCESS: 7717 FAILURE: 22283
failures:
21685x | Error: 13 INTERNAL: Received RST_STREAM with code 2 (Internal server error)
  598x | Error: 8 RESOURCE_EXHAUSTED: Bandwidth exhausted
Done in 32.99s.

Lougarou avatar Jun 01 '21 14:06 Lougarou

I'm seeing the same error on a recently upgraded instance on EventStore Cloud (20.10.2).

juanchristensen avatar Jun 02 '21 16:06 juanchristensen

Update: the issue is also reproducible on a non-EventStore-Cloud instance, and on a single node. The version is still 20.10.2 LTS.

Lougarou avatar Jun 04 '21 12:06 Lougarou

A workaround is to limit the number of open requests at any point in time, for example by setting max_in_flight to 500 on the test client:

yarn start wrfl --connection-string="esdb://admin:changeit@uri" --client_count=10 --request_count=30000 --stream_count=5000 --size=10 --max_in_flight=500

The correct value will depend on the use case.
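The same max_in_flight idea can be applied in application code with a small concurrency limiter. This is a minimal sketch; the `InFlightLimiter` class and its `run` method are illustrative names, not part of the EventStore client API:

```typescript
// A minimal in-flight limiter: at most `max` tasks run concurrently;
// further tasks queue until a slot frees up.
class InFlightLimiter {
  private active = 0;
  private queue: Array<() => void> = [];

  constructor(private readonly max: number) {}

  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active >= this.max) {
      // Wait until a running task finishes and wakes us.
      await new Promise<void>((resolve) => this.queue.push(() => resolve()));
    }
    this.active++;
    try {
      return await task();
    } finally {
      this.active--;
      // Hand the freed slot to the next queued task, if any.
      this.queue.shift()?.();
    }
  }
}
```

Each append would then be wrapped as `limiter.run(() => client.appendToStream(stream, events))`, so that no more than `max` writes are ever outstanding at once.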

Lougarou avatar Jun 04 '21 13:06 Lougarou

Error: 8 RESOURCE_EXHAUSTED: Bandwidth exhausted

Related issue on grpc-node: https://github.com/grpc/grpc-node/issues/1158

Note: setting grpc-node.max_session_memory did not appear to fix the issue.

Error: 13 INTERNAL: Received RST_STREAM with code 2 (Internal server error)

Related issue on grpc-node: https://github.com/grpc/grpc-node/issues/1532

George-Payne avatar Jun 04 '21 13:06 George-Payne

@Lougarou

This appears to be fixed by #305, as it now correctly applies backpressure, preventing the bandwidth exhaustion.

yarn start wrfl --connection-string="esdb+discover://admin:changeit@uri:2113" --client_count=10 --request_count=30000 --stream_count=5000 --size=10 --worker_count=1

Using batch append:

WORKER 0 TOTALS:  30000 WRITES IN 5678.69381499989ms (5282.905009028134/s)
       CLIENT 0:  3000 WRITES IN 5670.0697149999905ms (529.0940236702196/s)
       CLIENT 1:  3000 WRITES IN 5626.052846000064ms (533.2335266158938/s)
       CLIENT 2:  3000 WRITES IN 5602.990435000043ms (535.4283636216803/s)
       CLIENT 3:  3000 WRITES IN 5592.101767000044ms (536.4709236343868/s)
       CLIENT 4:  3000 WRITES IN 5570.788057999918ms (538.5234492437486/s)
       CLIENT 5:  3000 WRITES IN 5549.308671999956ms (540.6078806062897/s)
       CLIENT 6:  3000 WRITES IN 5534.745611000108ms (542.030331807411/s)
       CLIENT 7:  3000 WRITES IN 5522.03554299986ms (543.2779229034521/s)
       CLIENT 8:  3000 WRITES IN 5512.27463999996ms (544.2399364919927/s)
       CLIENT 9:  3000 WRITES IN 5505.814447999932ms (544.878514947011/s)
DONE TOTAL 30000 WRITES IN 5899.117489000084ms (5085.506443284125/s)
SUCCESS: 30000 FAILURE: 0

Forcing single append:

WORKER 0 TOTALS:  30000 WRITES IN 10455.60041499976ms (2869.275680903179/s)
       CLIENT 0:  3000 WRITES IN 5696.8384480001405ms (526.6078768747851/s)
       CLIENT 1:  3000 WRITES IN 6648.63874100009ms (451.2201845920627/s)
       CLIENT 2:  3000 WRITES IN 7097.50954400003ms (422.6834752953715/s)
       CLIENT 3:  3000 WRITES IN 6034.138383000158ms (497.1712296906933/s)
       CLIENT 4:  3000 WRITES IN 7518.423061000183ms (399.019844408823/s)
       CLIENT 5:  3000 WRITES IN 7988.667791000102ms (375.5319508191027/s)
       CLIENT 6:  3000 WRITES IN 8465.853178000078ms (354.3647565015653/s)
       CLIENT 7:  3000 WRITES IN 8929.67375899991ms (335.9585222222036/s)
       CLIENT 8:  3000 WRITES IN 10094.036437999923ms (297.205188273962/s)
       CLIENT 9:  3000 WRITES IN 9557.863628000021ms (313.87767358506966/s)
DONE TOTAL 30000 WRITES IN 10591.7202569996ms (2832.401089914957/s)
SUCCESS: 30000 FAILURE: 0

--client_count=1, forcing single append:

WORKER 0 TOTALS:  30000 WRITES IN 10676.161729000043ms (2809.998645722078/s)
DONE TOTAL 30000 WRITES IN 10812.09885499999ms (2774.6694145444926/s)
SUCCESS: 30000 FAILURE: 0
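For context, the backpressure mechanism at play is the standard Node.js stream pattern: pause when `write()` returns false and resume on the 'drain' event, instead of pushing writes unboundedly into the HTTP/2 session. This is an illustrative sketch of that general pattern, not the actual change in the referenced PR:

```typescript
import { Writable } from "stream";

// Write chunks while respecting backpressure: if write() returns false,
// the stream's internal buffer is full, so wait for 'drain' before
// writing more instead of letting the buffer grow without bound.
async function writeAll(stream: Writable, chunks: Iterable<string>): Promise<void> {
  for (const chunk of chunks) {
    if (!stream.write(chunk)) {
      await new Promise<void>((resolve) => stream.once("drain", () => resolve()));
    }
  }
  stream.end();
}
```

Without this check, a fast producer fills the transport's buffers faster than they can flush, which is how errors like RESOURCE_EXHAUSTED surface under high load.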

George-Payne avatar Aug 11 '22 09:08 George-Payne