OpenACCV-V icon indicating copy to clipboard operation
OpenACCV-V copied to clipboard

Several test cases where `async` is followed by synchronous execution without intervening `wait`

Open tschwinge opened this issue 1 year ago • 6 comments

For example:

https://github.com/OpenACCUserGroup/OpenACCV-V/blob/842a1fb436c3f309a0490cd29be9f24880cfce3e/Tests/acc_delete_finalize_async.c#L113

That exit data doesn't have an async clause, and neither a wait clause to wait for the preceding async(1) code, and also there is no intervening wait directive or acc_wait. Thus -- per my understanding -- that exit data may start executing while the the preceding async(1) code is still in progress, and thus may unmap data that's still in use.

As I've seen such a pattern not only here but also in several other test cases, I wonder if my understanding is flawed, or if all these should be fixed (in some way still to be determined)?

tschwinge avatar Mar 21 '23 10:03 tschwinge

Thank you for bringing this issue to my attention. Based on your description, it seems that the current code may lead to incorrect behavior if the exit data starts executing while the preceding async(1) code is still in progress. This could result in data being unmapped prematurely while it's still in use. We will fix this test and start looking into other async tests that need addressing.

When a PR is being made I will link this issue and close it at the time of the merge. Thank you.

ajarmusch avatar Apr 04 '23 15:04 ajarmusch

If we think that'd be useful, I could also bring this up with the OpenACC Technical Committee for clarification?

tschwinge avatar Apr 04 '23 21:04 tschwinge

Thank you! Were you able to bring this up with he Committee? We want to make sure our implementation is correct. If you get the chance please take a look at the related PR. Thank you for your patience!!

ajarmusch avatar Jun 07 '23 17:06 ajarmusch

Hi, I added a wait clause after the data construct, but I wasn't confident, and upon revisiting this, I think that the test was ok from the beginning. I think that the data construct runs synchronously and only the operations such as data transfers associated with the async clauses are attempted asynchronously. The data construct has a present clause, which makes sure the async(1) copyin of c is done. I might remove the wait clause I added before the copyout.

chrismun avatar Jul 03 '23 18:07 chrismun

OK, so we really need to run this through the OpenACC Technical Committee for clarification; I'll put it onto the agenda. Thanks for your patience.

tschwinge avatar Jul 13 '23 08:07 tschwinge

Consensus from the technical call on 7/13/23 is that the wait is necessary.

There was a discussion of whether because the data region isn't on queue 1 a wait is also required before the data region.

jefflarkin avatar Jul 13 '23 14:07 jefflarkin