
Worse performance than ATen: aten._log_softmax_backward_data

Open · IvanYashchuk opened this issue

🐛 Describe the bug

aten._log_softmax_backward_data.default

Here are the results compared to ATen:

| benchmark   | geomean | 20th percentile | 50th percentile | 80th percentile |
|-------------|---------|-----------------|-----------------|-----------------|
| HuggingFace | 0.94    | 0.87            | 0.97            | 0.99            |
| Torchbench  | 0.98    | 0.98            | 0.98            | 0.98            |
| TIMM        | 0.99    | 0.98            | 0.99            | 0.99            |
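For context, the geomean column aggregates per-sample speedup ratios (here presumably nvFuser relative to ATen, so values below 1.0 mean nvFuser is slower). A minimal sketch of how a geometric mean is computed, using made-up ratios rather than the actual benchmark data:

```python
import math

# hypothetical per-sample speedup ratios (nvFuser vs. ATen);
# not the actual measurements behind the table above
ratios = [0.87, 0.97, 0.99]

# geometric mean: exp of the average of the log-ratios
geomean = math.exp(sum(math.log(r) for r in ratios) / len(ratios))
print(round(geomean, 2))  # → 0.94
```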

Both the ATen and nvFuser paths use CUDA Graphs.

To reproduce, apply this patch first:

diff --git a/torch/_prims/context.py b/torch/_prims/context.py
index 203d73fd94..1789775e05 100644
--- a/torch/_prims/context.py
+++ b/torch/_prims/context.py
@@ -254,9 +254,9 @@ def _is_func_unsupported_nvfuser(
 class TorchRefsNvfuserCapabilityMode(TorchRefsMode):
     def __init__(self, *, skip_ops=()):
         aten_ops_to_skip = (
-            "aten._log_softmax.default",
-            "aten._log_softmax_backward_data.default",
-            "aten.expand.default",
+            #"aten._log_softmax.default",
+            #"aten._log_softmax_backward_data.default",
+            #"aten.expand.default",
         )
         self.skip_ops = tuple(skip_ops) + aten_ops_to_skip
         super().__init__(
git clone https://gitlab-master.nvidia.com/iyashchuk/aten_ops_perf.git
cd aten_ops_perf
python aten_ops_perf.py --suite huggingface --dtype float32 --max-samples 100 --op aten._log_softmax_backward_data.default

Check out this gist for the logs: https://gist.github.com/IvanYashchuk/8f433d9512ab1f02a7f960072ba10bb0#file-issue_log_softmax_backward-md

_log_softmax_backward_data is implemented here: https://github.com/pytorch/pytorch/blob/35be73df094f02dd26562cf665a6158e80bc4045/torch/_decomp/decompositions.py#L702-L710
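For reference, that decomposition computes the gradient as `grad_output - exp(output) * grad_output.sum(dim, keepdim=True)`, where `output` is the log-softmax result. A self-contained NumPy sketch of the same math (an illustration of the formula, not the PyTorch code itself):

```python
import numpy as np

def log_softmax(x, dim=-1):
    # numerically stable log-softmax along `dim`
    shifted = x - x.max(axis=dim, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=dim, keepdims=True))

def log_softmax_backward(grad_output, output, dim=-1):
    # mirrors the decomposition referenced above:
    # grad_input = grad_output - exp(output) * sum(grad_output, dim, keepdim=True)
    # (exp(output) recovers the softmax probabilities)
    return grad_output - np.exp(output) * grad_output.sum(axis=dim, keepdims=True)
```

A quick sanity check is to compare `log_softmax_backward` against a finite-difference gradient of `(log_softmax(x) * w).sum()`; the two agree to numerical precision.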

Versions

Checked on upstream master.

IvanYashchuk · Nov 03 '22