
Fallback to torch eager execution if the shapes don't match with the compiled graph

Open isidentical opened this issue 1 year ago • 6 comments

Is there an easy way to fallback to torch eager execution if the shapes don't match the oneflow compiled one? Torch provides torch._dynamo.run which prevents re-compilations so only compiled inputs get through the fast path and everything else falls back to eager/slow torch execution https://pytorch.org/docs/stable/torch.compiler_faq.html#why-are-you-recompiling-in-production
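For reference, the pattern described in the linked FAQ looks roughly like this (a minimal sketch; the function and shapes are illustrative):

```python
import torch
import torch._dynamo

def toy_example(x):
    return torch.relu(x) * 2

# Warm up: compile graphs for the shapes expected in production.
compiled = torch.compile(toy_example)
compiled(torch.randn(8, 16))

# "Frozen" mode from the FAQ: previously compiled graphs are reused,
# but no new graphs are generated; anything else runs in eager mode.
frozen = torch._dynamo.run(toy_example)
frozen(torch.randn(8, 16))   # hits the existing compiled graph
frozen(torch.randn(32, 16))  # unseen shape -> plain eager execution, no recompile
```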

isidentical avatar Dec 25 '23 11:12 isidentical

This seems like a very attractive feature. We have a way to determine whether a shape has already been compiled, and we can also make the wrapper call the original torch module, so it is possible to do this.

Is there a scenario in which you want to fall back to torch eager execution (it would be slow)?

strint avatar Dec 26 '23 02:12 strint

Is there a scenario in which you want to fall back to torch eager execution (it would be slow)?

Torch's dynamo.run documentation actually explains it pretty well:

In some cases, you may not want unexpected compiles after a program has warmed up. For example, if you are serving production traffic in a latency critical application. For this, TorchDynamo provides an alternate mode where prior compiled graphs are used, but no new ones are generated:

isidentical avatar Dec 26 '23 23:12 isidentical

Is there a scenario in which you want to fall back to torch eager execution (it would be slow)?

Torch's dynamo.run documentation actually explains it pretty well:

In some cases, you may not want unexpected compiles after a program has warmed up. For example, if you are serving production traffic in a latency critical application. For this, TorchDynamo provides an alternate mode where prior compiled graphs are used, but no new ones are generated:

This is reasonable; we will try to add this feature.

strint avatar Dec 27 '23 02:12 strint

Since we now disable the graph cache and use the VM to support dynamic shapes, there is no need to fall back to torch anymore.

https://github.com/siliconflow/onediff/releases/tag/0.12.0

@isidentical
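For reference, the usual onediff flow compiles a diffusers submodule once and, as of the release above, the same compiled module is expected to handle varying shapes. A minimal sketch (the model name and resolutions are illustrative):

```python
import torch
from diffusers import StableDiffusionPipeline
from onediff.infer_compiler import oneflow_compile

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Compile the UNet once; with the dynamic-shape support above, calls with
# different resolutions should reuse the same compiled module instead of
# triggering a per-shape recompilation.
pipe.unet = oneflow_compile(pipe.unet)

pipe("a photo of a cat", height=512, width=512)
pipe("a photo of a cat", height=768, width=512)  # different shape, no recompile expected
```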

strint avatar Jan 30 '24 09:01 strint

@strint this still makes sense for us when we are compiling pipelines that might not be officially supported by onediff yet. For example, I was able to get the AnimateDiffVideo2Video pipeline to work, but the dynamic number of frames caused it to re-compile every time (and the compiled artifacts weren't cached, e.g. it would first compile for 12 frames, then for 16, and then for 12 again).

I'd love to avoid compilation when possible. The same is true for guidance scale <= 1 (when not doing classifier-free guidance).
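Until there is a built-in mode for this, one caller-side workaround (purely illustrative, not part of onediff) is to keep the original eager module around and only route warmed-up shapes to the compiled copy:

```python
import torch

class ShapeGatedModule(torch.nn.Module):
    """Send warmed-up input shapes to the compiled module, everything else to eager."""

    def __init__(self, eager_module, compiled_module, warmed_up_shapes):
        super().__init__()
        self.eager = eager_module
        self.compiled = compiled_module
        self.known_shapes = set(warmed_up_shapes)  # e.g. {(2, 4, 64, 64)}

    def forward(self, x, *args, **kwargs):
        if tuple(x.shape) in self.known_shapes:
            return self.compiled(x, *args, **kwargs)  # fast, already-compiled path
        return self.eager(x, *args, **kwargs)         # unseen shape -> eager fallback
```

The guidance-scale case could be gated the same way, e.g. by checking the batch size the pipeline passes to the UNet.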

isidentical avatar Feb 09 '24 15:02 isidentical

Now that we support dynamic-shape runs, there is no good way to determine whether a given shape is supported or not.

But we do have a way to determine whether the number of inputs has changed. If the number of inputs changes, the graph architecture has to change, so we must re-compile a new graph.

We could make that case fall back to torch execution instead of compiling a new graph. Is this what you want? @isidentical

https://github.com/siliconflow/onediff/blob/6b54c0872079dec578ca266b82fbb5f03f6b44b8/src/onediff/infer_compiler/utils/args_tree_util.py#L45
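A rough sketch of how that could look from the caller's side (this is not the onediff implementation, just the idea): record the flattened input structure on the first call and run the eager module whenever a later call produces a different structure.

```python
import torch
from torch.utils._pytree import tree_flatten

class StructureGatedModule(torch.nn.Module):
    """Fall back to eager execution when the flattened input structure changes."""

    def __init__(self, eager_module, compiled_module):
        super().__init__()
        self.eager = eager_module
        self.compiled = compiled_module
        self._signature = None  # (leaf count, tree spec) of the first compiled call

    def forward(self, *args, **kwargs):
        leaves, spec = tree_flatten((args, kwargs))
        signature = (len(leaves), spec)
        if self._signature is None:
            self._signature = signature     # first call defines the graph's inputs
        if signature == self._signature:
            return self.compiled(*args, **kwargs)
        return self.eager(*args, **kwargs)  # input structure changed -> no recompile
```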

strint avatar Apr 15 '24 14:04 strint