hpx icon indicating copy to clipboard operation
hpx copied to clipboard

Excessive overheads that loop_idx_n calls cancellation_token::was_cancelled per each loop.

Open taeguk opened this issue 7 years ago • 8 comments

https://github.com/STEllAR-GROUP/hpx/blob/master/hpx/parallel/util/loop.hpp#L485 https://github.com/STEllAR-GROUP/hpx/blob/master/hpx/parallel/util/cancellation_token.hpp#L41-L44

Currently, loop_idx_n calls cancellation_token::was_cancelled per each loop. But, It seems to be excessive to check whether token was cancelled. If the cancellation occurs rarely, occasional checking the cancellation will be much better. It's obviously excessive overhead to check for cancellation on every loop. I know that it is very experimental and difficult to determine the cycle to check cancellation. I think that adding a parameter for specifing the cycle can be one of solutions. (default value : 1)

I think this issue is very important because loop_idx_n is now used for parallel algorithms like find, is_heap, etc..

taeguk avatar May 30 '17 15:05 taeguk

@taeguk: I agree. Go for it! In the end this could be exposed through executor parameters (similar to chunk sizes, etc.)

hkaiser avatar May 30 '17 18:05 hkaiser

@taeguk, If we add this we might have to be especially careful with algorithms which roll back the work done before the cancellation_token was triggered (e.g. the uninitialized_...` family of algorithms).

hkaiser avatar Jun 01 '17 00:06 hkaiser

algorithms which roll back the work done before the cancellation_token was triggered

@hkaiser I can't find that case. https://github.com/STEllAR-GROUP/hpx/blob/master/hpx/parallel/util/loop.hpp#L401 In above codes, it seems that 'roll back' is performed by only a thread which cancels token. (https://github.com/STEllAR-GROUP/hpx/blob/master/hpx/parallel/util/loop.hpp#L408-L411)

taeguk avatar Jun 01 '17 03:06 taeguk

@taeguk This happens for instance here: https://github.com/STEllAR-GROUP/hpx/blob/master/hpx/parallel/algorithms/uninitialized_copy.hpp#L99-L132. The partition which is cancelled directly rolls back, all other partitions are rolled back by the 3rd function passed to partitioner_with_cleanup.

However even for those algorithms it might not be a problem if we checked for cancellation not after every iteration. This is just something we need to be careful about.

hkaiser avatar Jun 01 '17 12:06 hkaiser

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jul 04 '19 08:07 stale[bot]

This issue has been automatically closed. Please re-open if necessary.

stale[bot] avatar Aug 03 '19 09:08 stale[bot]

I suspect that cancellation tokens and support for them needs to be kept open. One solution to this problem (in a broader sense) would be to only enable cancellation of tasks when designated executors/schedulers are used (executor params as mentioned above for example). Adding checks for cancellation in the lowest level loops introduces overheads that are unnecessary and expensive - only allowing cancellation if the user has requested it beforehand via the executor/scheduler/thread_pool might be a way to address this.

biddisco avatar Aug 05 '19 08:08 biddisco

@biddisco good thinking! This would be yet another property we could add once we have migrated our executors to support those...

hkaiser avatar Aug 05 '19 11:08 hkaiser

This was fixed by #6069

hkaiser avatar Dec 04 '22 15:12 hkaiser