executorch Remove no-op _clone_dim

Clean up _clone_dim_order ops in the graph which don't change dim order.

cc @JacobSzwejbka @angelayi

Nov 11 '25 17:11 GregoryComer

You can use RemoveCloneOpsTransform (https://github.com/pytorch/executorch/blob/main/backends/transforms/remove_clone_ops.py) to remove extra clone and dim order clone

Nov 13 '25 17:11 Gasoonjia

You can use RemoveCloneOpsTransform (https://github.com/pytorch/executorch/blob/main/backends/transforms/remove_clone_ops.py) to remove extra clone and dim order clone

Thanks. Do you have any objection to adding this to the to_edge flow?

Nov 13 '25 17:11 GregoryComer

yeah i belive making it a part of to_edge flow might not be a good idea or not enough, since some "unused" clone-like op will be used after to_backend; removing them may introduce some issue. Also there might be some unused clone-like op being introduced during to_backend, only use that in to_edge stage may not be enough.

Nov 13 '25 18:11 Gasoonjia

yeah i belive making it a part of to_edge flow might not be a good idea or not enough, since some "unused" clone-like op will be used after to_backend; removing them may introduce some issue. Also there might be some unused clone-like op being introduced during to_backend, only use that in to_edge stage may not be enough.

Can you elaborate a little more on this? Removing the no-ops before calling the partitioners should be effectively invisible to everything downstream. In to_backend, the partitioners and lowering code are bound to not alter the graph outside of the partition.

My largest concern is actually non-functional custom ops, since this is the only way the optimization could affect the graph semantics. Though, from discussion, this likely isn't a big concern.

Specifically, there is a bit of perf impact currently that I want to mitigate from this. I could add it to the XNNPACK backend (and other backends), but it seems nicer to just optimize graph when going to edge dialect.

Nov 13 '25 19:11 GregoryComer

I see there was some discussion around this in https://github.com/pytorch/executorch/pull/12974. Is that what you're referring to? From my understanding, the risk is specifically around removing clones which modify dim order. So we should be safe so long as we're only stripped no-op clones (no dim order change). Is that in line with your understanding?

Nov 13 '25 19:11 GregoryComer

Solved for XNNPACK, at least, so main issue is done. Still hoping to land the optimization for core.

Dec 02 '25 17:12 GregoryComer

Remove no-op _clone_dim_order ops