appengine-mapreduce

processing_rate is a multiplier and not an absolute number

Open MeLight opened this issue 9 years ago • 5 comments

When I pass "processing_rate":1 as part of mapper_params and examine the logs of /mapreduce/worker_callback, I see that each worker callback processes 8 entities. If I set "processing_rate":2, each callback processes 16 entities. On another project I worked on, the numbers were 15 and 30 (for processing_rate of 1 and 2). So I conclude that the processing_rate param is a multiplier.

  • Where can I set the actual value?
  • What makes it change from project to project?

MeLight avatar Oct 14 '15 11:10 MeLight
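
The arithmetic behind that conclusion can be checked with a small sketch. It uses only the four numbers reported above; the helper name `per_unit_factor` is made up for illustration and is not part of appengine-mapreduce. It says nothing about where the per-project factor comes from, only that each project's counts scale linearly with processing_rate.

```python
# Observations reported in the issue: processing_rate -> entities per callback.
project_a = {1: 8, 2: 16}
project_b = {1: 15, 2: 30}


def per_unit_factor(observations):
    """Return entities per callback per unit of processing_rate.

    Returns None when the observations do not scale linearly.
    (Hypothetical helper, for illustration only.)
    """
    ratios = {count / rate for rate, count in observations.items()}
    return ratios.pop() if len(ratios) == 1 else None


print(per_unit_factor(project_a))  # 8.0
print(per_unit_factor(project_b))  # 15.0
```

Both projects give a single constant factor (8 and 15), which is consistent with processing_rate acting as a multiplier on a project-specific base value.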

processing_rate really should be removed. It's a throttle to restrict the rate items are processed. If you don't want to artificially slow things down, then don't set it.

tkaitchuck avatar Oct 16 '15 21:10 tkaitchuck

So there's no way of knowing how many entities each shard will process?

MeLight avatar Oct 16 '15 21:10 MeLight

It will process as many as it can in the configured slice interval. Is there some reason you want to restrict it?

tkaitchuck avatar Oct 22 '15 05:10 tkaitchuck
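
As a rough mental model of "as many as it can in the configured slice interval", here is a minimal sketch of a time-bounded slice with an optional item cap. It is not the library's implementation: `run_slice`, `slice_seconds`, `max_items`, and `clock` are all hypothetical names chosen for this example.

```python
import time


def run_slice(items, handler, slice_seconds, max_items=None, clock=time.monotonic):
    """Process items until the time budget or the optional cap is reached.

    Hypothetical helper, not appengine-mapreduce code.
    Returns the number of items processed.
    """
    deadline = clock() + slice_seconds
    processed = 0
    for item in items:
        # Optional throttle: stop once the per-slice cap is reached.
        if max_items is not None and processed >= max_items:
            break
        # Time budget: stop once the slice interval has elapsed.
        if clock() >= deadline:
            break
        handler(item)
        processed += 1
    return processed


seen = []
# Unthrottled: only the time budget limits the slice.
print(run_slice(range(100), seen.append, slice_seconds=5))  # 100
# Throttled: at most one item per slice.
print(run_slice(range(100), seen.append, slice_seconds=5, max_items=1))  # 1
```

Without a cap, the count per slice depends on how fast the handler runs, which is why it cannot be known in advance; with a cap, the count is bounded and predictable.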

I want to restrict the rate to one entity per task so that it is easier to analyze the logs and track errors.

MeLight avatar Oct 25 '15 09:10 MeLight

Then, yes, that is the setting you want. All changes to the value should produce a linear change in the rate.

tkaitchuck avatar Nov 01 '15 00:11 tkaitchuck