Possible to gracefully recover from StatusCode.RESOURCE_EXHAUSTED?

Open • cold-logic opened this issue 2 years ago • 1 comment

To give a little context, I have a couple of Celery tasks: one that processes and aggregates chunks of data, and another (my callback task) that gets called once all of the data has been aggregated. When my callback task gets called with all of the data as the task payload, Clearly chokes and stops listening for events. Are there any settings or flags I can pass in to gracefully ignore or recover from this error and continue monitoring the tasks?

19:52:52.926  SUCCESS 0 v2.tasks.process_data.create_data_processors d4add4a5-1396-4697-8ff2-52cc3c21e681
19:52:52.926  STARTED 0 v2.tasks.process_data.chunk_processor ba383a3d-a7be-4044-b2bd-e0f63bd510b4
19:52:52.928 RECEIVED 0 v2.tasks.process_data.chunk_processor ab7741dc-cd3e-494d-929d-169bbc6c164b
19:52:52.929 RECEIVED 0 v2.tasks.process_data.chunk_processor ddae671f-ccbf-4b18-9b29-c69c20e9617a
19:52:53.012  STARTED 0 v2.tasks.process_data.chunk_processor 405ffbb1-4316-4107-8386-4cf6f53879fc
19:52:53.014 RECEIVED 0 v2.tasks.process_data.chunk_processor d4cc2c42-e5ca-4309-b533-5e65a5980ee7
19:52:53.114  STARTED 0 v2.tasks.process_data.chunk_processor 1bb463af-74de-46d7-ba25-a85bca290914
19:52:53.116 RECEIVED 0 v2.tasks.process_data.chunk_processor b7b0ddd3-a767-4cc2-ad76-a08c1a969147
19:52:53.222  STARTED 0 v2.tasks.process_data.chunk_processor ab7741dc-cd3e-494d-929d-169bbc6c164b
19:52:53.223  STARTED 0 v2.tasks.process_data.chunk_processor ddae671f-ccbf-4b18-9b29-c69c20e9617a
19:52:53.224  STARTED 0 v2.tasks.process_data.chunk_processor d4cc2c42-e5ca-4309-b533-5e65a5980ee7
19:52:53.228  STARTED 0 v2.tasks.process_data.chunk_processor b7b0ddd3-a767-4cc2-ad76-a08c1a969147
19:52:56.302  SUCCESS 0 v2.tasks.process_data.chunk_processor b7b0ddd3-a767-4cc2-ad76-a08c1a969147
19:52:59.080  SUCCESS 0 v2.tasks.process_data.chunk_processor ba383a3d-a7be-4044-b2bd-e0f63bd510b4
19:52:59.136  SUCCESS 0 v2.tasks.process_data.chunk_processor 405ffbb1-4316-4107-8386-4cf6f53879fc

# --> my callback task happened here

Server communication error: Received message larger than max (4764885 vs. 4194304) (StatusCode.RESOURCE_EXHAUSTED)

cold-logic, Oct 06 '21 20:10

Hey, clearly itself does not define any StatusCode... After a little searching, it seems this comes from gRPC, the communication framework from Google. Take a look at this ticket: https://github.com/tensorflow/serving/issues/1382

I can include those options here, and it'll probably solve your issue, but I don't have much free time at the moment. Maybe you could fix it? Send a PR and I'll review/merge it if you can. 👍
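For anyone picking this up, here is a minimal sketch of what raising the gRPC limits might look like. The 4194304 in the error is gRPC's default 4 MiB receive limit, and the rejected message was 4764885 bytes. The channel arguments `grpc.max_send_message_length` and `grpc.max_receive_message_length` are real gRPC options, but the helper names, the 16 MiB value, and the idea that clearly would pass them through are assumptions. Nothing in this thread shows clearly exposing such a setting.

```python
# Sketch only: helper names and the 16 MiB limit are illustrative assumptions,
# not part of clearly's API.
DEFAULT_MAX_BYTES = 4 * 1024 * 1024  # gRPC's default receive limit (4194304)


def message_size_options(max_bytes):
    """Build gRPC channel arguments that raise the message size limits."""
    return [
        ("grpc.max_send_message_length", max_bytes),
        ("grpc.max_receive_message_length", max_bytes),
    ]


def open_channel(target, max_bytes=16 * 1024 * 1024):
    """Open an insecure channel with larger limits (requires grpcio)."""
    import grpc  # imported lazily so the helper above works without grpcio

    return grpc.insecure_channel(target, options=message_size_options(max_bytes))
```

The same list of options can be passed to `grpc.server(..., options=...)`, so both ends of the connection would need to agree on the larger limit. A fixed limit only moves the ceiling: a big enough task payload would still trigger the same error.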

rsalmei, Oct 08 '21 02:10