xla Enroll EagerOperations to DTensor caching.

Open copybara-service[bot] opened this issue 2 years ago • 0 comments

This works around a leak problem when the same Eager operation is executed multiple times.
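To illustrate why caching sidesteps the leak, here is a minimal toy sketch. It is not the DTensor implementation: `Executor`, `LoweredOp`, and the `(op_name, input_shapes)` cache key are all hypothetical names chosen for illustration. It only shows that when per-operation objects are never released, a cache keeps their number bounded by the number of distinct operations rather than the number of executions.

```python
from dataclasses import dataclass, field


@dataclass
class LoweredOp:
    """Hypothetical stand-in for the per-operation objects DTensor creates."""
    key: tuple


@dataclass
class Executor:
    """Toy executor (an assumption for illustration, not the real DTensor device)."""
    use_cache: bool
    cache: dict = field(default_factory=dict)
    live_objects: list = field(default_factory=list)

    def run(self, op_name: str, input_shapes: tuple) -> LoweredOp:
        key = (op_name, input_shapes)
        if self.use_cache and key in self.cache:
            # Cache hit: reuse the existing object, nothing new is allocated.
            return self.cache[key]
        lowered = LoweredOp(key)
        # Nothing ever removes entries from live_objects, which models the leak.
        self.live_objects.append(lowered)
        if self.use_cache:
            self.cache[key] = lowered
        return lowered


uncached = Executor(use_cache=False)
cached = Executor(use_cache=True)
for _ in range(1000):
    uncached.run("MatMul", ((2, 3), (3, 4)))
    cached.run("MatMul", ((2, 3), (3, 4)))

# The uncached executor holds one object per execution; the cached one holds one in total.
print(len(uncached.live_objects), len(cached.live_objects))
```

The objects are still never freed in the cached case, which is why this is a workaround rather than a fix.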

A more proper fix would be to register a notifier_fn that fires when the eager Node finishes execution and removes the corresponding DTensor-created objects at that time. The infrastructure for that does not exist yet.
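As a rough sketch of what the notifier-based cleanup might look like if the infrastructure existed: the `Node` class, `register_notifier`, `launch`, and `created_objects` below are hypothetical and do not correspond to any existing TensorFlow API. The only idea taken from the description above is that a callback runs when a node finishes and releases the objects created for it.

```python
from typing import Callable, Dict, List


class Node:
    """Hypothetical eager node that runs registered callbacks when it finishes."""

    def __init__(self, node_id: int) -> None:
        self.node_id = node_id
        self._notifiers: List[Callable[[int], None]] = []

    def register_notifier(self, notifier_fn: Callable[[int], None]) -> None:
        self._notifiers.append(notifier_fn)

    def finish(self) -> None:
        # Invoke every registered callback with the id of the finished node.
        for notifier_fn in self._notifiers:
            notifier_fn(self.node_id)


# Objects created on behalf of each node, keyed by node id (hypothetical registry).
created_objects: Dict[int, object] = {}


def launch(node_id: int) -> Node:
    node = Node(node_id)
    created_objects[node_id] = object()
    # Release the per-node object as soon as the node reports completion.
    node.register_notifier(lambda finished_id: created_objects.pop(finished_id, None))
    return node


nodes = [launch(i) for i in range(3)]
live_before = len(created_objects)
for node in nodes:
    node.finish()
live_after = len(created_objects)

# Three objects are live while the nodes are pending; none remain after they finish.
print(live_before, live_after)
```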

This change simplifies our code base by ensuring that EagerOperations and functions go through roughly the same execution sequence. While each such branch may seem benign on its own, these small differences add up and make it harder to reason about the behavior of the code when there is a bug.

The SameShape cached layout policy is also removed. The rationale is that it is a bigger axe than the case it was intended to fix warrants. The cache introduces implicit long-range interactions between pieces of user code, which in general makes the system's behavior harder to reason about. If we want to fix the layout propagation of Conv grads, then we should pay the cost and fix it there.

Mar 09 '23 22:03 copybara-service[bot]
