Halide Fix #5175: Add an extra check for input dependencies of the async producer

As described in the #5175, it's possible to write an illegal schedule which successfully goes through the code generation, but generated code produces incorrect results.

This PR adds an extra check to detect such illegal schedules by verifying that if in realize node A there is a producer node B and producer A is async and is consumer of B then B must be also scheduled as async.

Aug 20 '20 20:08 vksnk

LGTM but @zvookin should probably review

Aug 20 '20 21:08 steven-johnson

There's something I'm not getting here. Generally the things an async scheduled Func depends on do not need to be async themselves. I'm guessing this has to do with the storage being allocated outside the async invocations and the compute being inside. But then I'm not quite seeing how requiring the dependency to be async always makes it correct. I will sit down and look at it a little more closely, but I think updating the description to say something in terms of which schedules are valid and invalid, rather than describing the IR conditions they result in, might help.

Aug 25 '20 18:08 zvookin

Thank you for reviewing, that's a great question.

Generally the things an async scheduled Func depends on do not need to be async themselves.

Right, but input dependency need to be computed fully beforehand. If input dependency is still being produced, async_producer needs to have a way to know that data it depends on is ready.

If I understand correctly, async producers are lifted into separate "thread" based on their store_level, so if store_level is the same as compute_level then there is sort of implicit synchronization point between input dependency and producer:

for some_dim:
  compute input_dependency
  --> implicit sync point <--
  fork {
    compute async_producer  (depends on input_dependency)
    semaphores to sync between async_producer and consumer
  } {
    semaphores to sync between async_producer and consumer
    compute consumer of async_producer
  }

However, if store level is different from compute level then async producer will be lifted out of the loop and this implicit synchronization point is gone:

fork {
for some_dim:
  compute async_producer  (depends on input_dependency, but can't really know how much of it was produced in another thread) 
  semaphores to sync between async_producer and consumer
} {
for some_dim:
  compute input_dependency
  
  semaphores to sync between async_producer and consumer
  compute consumer of async_producer
}

If this logic is correct then solution can be either making a copy of input_producer in both "threads" or inserting explicit synchronization between input_producer and async_producer. The latter looks exactly like what async() already does, so it seemed to me like a more reasonable way to address it.

(Maybe, it's also possible to do forking differently, so it's not based on store level, but I am not sure and it sounds like a really big change).

Aug 25 '20 20:08 vksnk

Monday review ping.

Aug 31 '20 16:08 vksnk

Is this still just awating review?

Sep 17 '20 17:09 steven-johnson

Is this just waiting on review?

Sep 24 '20 20:09 steven-johnson

Yes, review is still needed.

Also, I recently realized that #5201 might be just a different type of the same issue, so maybe it's possible/worth trying to generalize check from this PR to detect the case from #5201.

Sep 24 '20 21:09 vksnk

Where does this PR stand?

Oct 22 '20 16:10 steven-johnson

I can try to generalize it to cover the other case, but it would be great to get a feedback first in case there is a flaw in the logic somewhere.

Oct 22 '20 17:10 vksnk

This PR is pretty old. Is it still active?

Apr 12 '21 16:04 steven-johnson

This is still a problem which needs fixing, so yes. That being said, I discovered a test which still fails, so this needs to be figured out first. I am planning to return to async in couple weeks for some internal stuff, so hopefully can get this fixed too then.

Apr 12 '21 17:04 vksnk

This is still a problem which needs fixing, so yes. That being said, I discovered a test which still fails, so this needs to be figured out first. I am planning to return to async in couple weeks for some internal stuff, so hopefully can get this fixed too then.

What was the test that still fails? Is it part of this PR?

Sep 28 '23 16:09 abadams

Halide Halide copied to clipboard

Fix #5175: Add an extra check for input dependencies of the async producer

Halide
Halide copied to clipboard