pytorch/pytorch
Worse performance than ATen: aten._log_softmax_backward_data
🐛 Describe the bug
The nvFuser path for `aten._log_softmax_backward_data.default` performs worse than the ATen implementation. Here are the results compared to ATen:
| benchmark | geomean | 20th percentile | 50th percentile | 80th percentile |
|---|---|---|---|---|
| HuggingFace | 0.94 | 0.87 | 0.97 | 0.99 |
| Torchbench | 0.98 | 0.98 | 0.98 | 0.98 |
| TIMM | 0.99 | 0.98 | 0.99 | 0.99 |
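For reference, summary statistics like the ones in the table can be computed from per-sample speedup ratios. The ratios below are made up for illustration; the helper names are mine, not part of the benchmark script:

```python
import math

# Hypothetical per-sample speedup ratios (nvFuser vs. ATen); illustrative only.
speedups = [0.85, 0.90, 0.95, 0.97, 0.99, 1.00]

def geomean(xs):
    # Geometric mean: exp of the mean of the logs.
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def percentile(xs, p):
    # Simple nearest-rank percentile on sorted data.
    xs = sorted(xs)
    k = max(0, math.ceil(p / 100 * len(xs)) - 1)
    return xs[k]

print(round(geomean(speedups), 2))
print(percentile(speeds := speedups, 50))
```

A geomean below 1.0 summarizes an overall slowdown relative to ATen across the sampled workloads.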
Both the ATen and nvFuser paths use CUDA Graphs.
To reproduce, first apply this patch (it removes the ops from nvFuser's skip list):
```diff
diff --git a/torch/_prims/context.py b/torch/_prims/context.py
index 203d73fd94..1789775e05 100644
--- a/torch/_prims/context.py
+++ b/torch/_prims/context.py
@@ -254,9 +254,9 @@ def _is_func_unsupported_nvfuser(
 class TorchRefsNvfuserCapabilityMode(TorchRefsMode):
     def __init__(self, *, skip_ops=()):
         aten_ops_to_skip = (
-            "aten._log_softmax.default",
-            "aten._log_softmax_backward_data.default",
-            "aten.expand.default",
+            #"aten._log_softmax.default",
+            #"aten._log_softmax_backward_data.default",
+            #"aten.expand.default",
         )
         self.skip_ops = tuple(skip_ops) + aten_ops_to_skip
         super().__init__(
```
Then run the benchmark script:

```shell
git clone https://gitlab-master.nvidia.com/iyashchuk/aten_ops_perf.git
cd aten_ops_perf
python aten_ops_perf.py --suite huggingface --dtype float32 --max-samples 100 --op aten._log_softmax_backward_data.default
```
Check out this gist for the logs: https://gist.github.com/IvanYashchuk/8f433d9512ab1f02a7f960072ba10bb0#file-issue_log_softmax_backward-md
The decomposition of `_log_softmax_backward_data` is implemented here: https://github.com/pytorch/pytorch/blob/35be73df094f02dd26562cf665a6158e80bc4045/torch/_decomp/decompositions.py#L702-L710
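For context, the decomposition linked above boils down to `grad_input = grad_output - exp(output) * sum(grad_output, dim)`, where `output` is the forward log-softmax result. A minimal 1-D sketch of that formula (my own paraphrase in plain Python, not the actual PyTorch decomposition code):

```python
import math

def log_softmax(x):
    # Numerically stable log-softmax over a 1-D list.
    m = max(x)
    lse = m + math.log(sum(math.exp(v - m) for v in x))
    return [v - lse for v in x]

def log_softmax_backward(grad_output, output):
    # Backward formula used by the decomposition (1-D case):
    #   grad_input = grad_output - exp(output) * sum(grad_output)
    # exp(output) recovers the softmax probabilities from the saved
    # log-softmax output, so no extra normalization pass is needed.
    s = sum(grad_output)
    return [g - math.exp(o) * s for g, o in zip(grad_output, output)]

x = [1.0, 2.0, 3.0]
y = log_softmax(x)
grad = log_softmax_backward([1.0, 0.0, 0.0], y)
```

In ATen this backward is a hand-written fused kernel, while the nvFuser path compiles the decomposed ops, which is one plausible source of the gap reported above.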
Versions
Checked on upstream master.