Randomize origin order in regional analyses

Open abyrd opened this issue 7 years ago • 4 comments

In regional analyses, we currently handle origins row by row, starting at the upper left corner. If, as in the Netherlands, the upper left corner is mostly water, the regional analysis will proceed much faster at the beginning than later on. Early progress therefore gives a misleadingly short estimate of the total run time for a job.

abyrd, Nov 16 '17 00:11

I've reimplemented the broker to track completed tasks using bitsets. The challenge with randomization of order is that it requires materializing the sequence, which uses more memory. The advantage of more accurate run time prediction is probably not worth the extra complexity of materializing this order list.

abyrd, Dec 06 '17 12:12

I think you could do a random walk over the bitset and wrap around at the end. Say you jump forward a random amount between 0 and the length of the bitset, then take the next unset bit and enqueue that task.
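
A minimal sketch of that idea, not the broker's actual code. It assumes a `bytearray` of flags stands in for the bitset, and that a bit is set as soon as a task is enqueued (the real broker tracks completed tasks, so it would need a second set or a different flag). The names `next_clear_bit` and `random_walk_order` are made up for illustration.

```python
import random

def next_clear_bit(flags, start):
    """Index of the first unset flag at or after start, wrapping around.

    Returns None if every flag is set.
    """
    n = len(flags)
    for offset in range(n):
        i = (start + offset) % n
        if not flags[i]:
            return i
    return None

def random_walk_order(n_tasks, seed=None):
    """Yield each task index exactly once, in random-walk order.

    Hypothetical helper: only the flags are stored, never the full sequence.
    """
    rng = random.Random(seed)
    enqueued = bytearray(n_tasks)  # one flag per task, stand-in for a bitset
    position = 0
    for _ in range(n_tasks):
        # Jump forward by a random amount in [0, n_tasks), wrapping at the end.
        position = (position + rng.randrange(n_tasks)) % n_tasks
        # Take the next task that has not been enqueued yet.
        position = next_clear_bit(enqueued, position)
        enqueued[position] = 1
        yield position

if __name__ == "__main__":
    print(list(random_walk_order(10, seed=42)))
```

Two caveats: this is not a uniform shuffle, since a task sitting right after a long run of already-enqueued tasks is more likely to be picked next, and the scan for an unset bit gets slower as the bitset fills up.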

mattwigway, Jun 21 '18 19:06

Thinking ahead to when partial results will be displayed, it could be useful to start at the center of a rectangular grid and spiral outward.
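
A rough sketch of such an ordering, assuming origins are addressed as `(x, y)` cells with `y` increasing downward. The function name `spiral_order` is hypothetical. It walks a square spiral lazily from the center cell and skips positions that fall outside the grid, so no order list has to be materialized.

```python
def spiral_order(width, height):
    """Yield every (x, y) cell of a width x height grid, center first."""
    remaining = width * height
    x, y = (width - 1) // 2, (height - 1) // 2  # center (upper-left of center if even)
    if remaining > 0:
        yield (x, y)
        remaining -= 1
    dx, dy = 1, 0  # start heading right
    step = 1       # leg lengths go 1, 1, 2, 2, 3, 3, ...
    while remaining > 0:
        for _ in range(2):  # two legs share each leg length
            for _ in range(step):
                x += dx
                y += dy
                # The spiral may leave a non-square grid; skip those cells.
                if 0 <= x < width and 0 <= y < height:
                    yield (x, y)
                    remaining -= 1
            dx, dy = -dy, dx  # turn 90 degrees (clockwise on screen)
        step += 1

if __name__ == "__main__":
    for cell in spiral_order(3, 3):
        print(cell)
```

For a strongly elongated grid the walk spends many steps outside the grid, so a real implementation would probably want to clip each leg instead of stepping cell by cell.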

ansoncfit, Feb 17 '21 21:02

Having workers handle blocks of adjacent origins is probably also more cache-efficient, as the origins being processed in parallel on the same machine will use many of the same streets and transit routes. The blocks could still be distributed randomly to different workers, but each block of tasks handed to a particular worker could be geographically contiguous.
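
One way this could look, as a sketch only: cut the grid into square tiles, keep each tile's origins together, and shuffle the order in which tiles are handed out. The function name `contiguous_blocks`, the `block_size` parameter, and the row-major task index `y * width + x` are all assumptions for illustration.

```python
import random

def contiguous_blocks(width, height, block_size, seed=None):
    """Return a shuffled list of blocks, each a list of row-major task indexes.

    Every block covers one block_size x block_size tile (smaller at the
    right and bottom edges), so its origins are geographically adjacent.
    """
    blocks = []
    for y0 in range(0, height, block_size):
        for x0 in range(0, width, block_size):
            block = [
                y * width + x
                for y in range(y0, min(y0 + block_size, height))
                for x in range(x0, min(x0 + block_size, width))
            ]
            blocks.append(block)
    # Randomize which tile is handed out next, not the origins inside a tile.
    random.Random(seed).shuffle(blocks)
    return blocks

if __name__ == "__main__":
    for block in contiguous_blocks(5, 4, 2, seed=7):
        print(block)
```

Note that this materializes the list of tiles, though that list is much shorter than the list of individual origins.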

abyrd, Oct 17 '23 09:10