# Worse performance than ATen: aten._log_softmax

### 🐛 Describe the bug
The nvFuser path for `aten._log_softmax.default` is slower than ATen. Here are the results compared to ATen (values below 1.0 mean the nvFuser path is slower):
| benchmark | geomean | 20th percentile | 50th percentile | 80th percentile |
|---|---|---|---|---|
| HuggingFace | 0.91 | 0.63 | 0.99 | 1.21 |
| Torchbench | 0.99 | 0.99 | 0.99 | 0.99 |
| TIMM | 0.99 | 0.98 | 0.99 | 1.0 |
Both the ATen and nvFuser paths use CUDA Graphs.
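For reference, here is a minimal sketch of how a single shape can be timed under CUDA Graphs. This is not the harness that produced the table above, just an illustration of graph-captured timing:

```python
import torch

x = torch.randn(512, 50265, device="cuda")  # one of the badly performing shapes

# Warm up on a side stream before capture, as the CUDA Graphs docs recommend.
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
    for _ in range(3):
        y = torch.nn.functional.log_softmax(x, dim=1)
torch.cuda.current_stream().wait_stream(s)

# Capture a single log_softmax call into a graph, then time replays.
g = torch.cuda.CUDAGraph()
with torch.cuda.graph(g):
    y = torch.nn.functional.log_softmax(x, dim=1)

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
start.record()
for _ in range(100):
    g.replay()
end.record()
torch.cuda.synchronize()
print(f"{start.elapsed_time(end) / 100:.3f} ms per replay")
```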
To reproduce, apply this patch first:
```diff
diff --git a/torch/_prims/context.py b/torch/_prims/context.py
index 203d73fd94..1789775e05 100644
--- a/torch/_prims/context.py
+++ b/torch/_prims/context.py
@@ -254,9 +254,9 @@ def _is_func_unsupported_nvfuser(
 class TorchRefsNvfuserCapabilityMode(TorchRefsMode):
     def __init__(self, *, skip_ops=()):
         aten_ops_to_skip = (
-            "aten._log_softmax.default",
-            "aten._log_softmax_backward_data.default",
-            "aten.expand.default",
+            #"aten._log_softmax.default",
+            #"aten._log_softmax_backward_data.default",
+            #"aten.expand.default",
         )
         self.skip_ops = tuple(skip_ops) + aten_ops_to_skip
         super().__init__(
```
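With the skip entries commented out, tracing under `TorchRefsNvfuserCapabilityMode` lowers `aten._log_softmax.default` through the reference decomposition instead of keeping the ATen call. A minimal sketch of that tracing step, assuming the usual `make_fx` flow:

```python
import torch
from torch.fx.experimental.proxy_tensor import make_fx
from torch._prims.context import TorchRefsNvfuserCapabilityMode

def fn(x):
    return torch.nn.functional.log_softmax(x, dim=1)

x = torch.randn(512, 50265, device="cuda")

# With the patch applied, aten._log_softmax.default is no longer in skip_ops,
# so the capability mode intercepts the call and records the reference
# decomposition instead of the opaque ATen op.
with TorchRefsNvfuserCapabilityMode():
    gm = make_fx(fn)(x)

gm.graph.print_tabular()  # inspect the prims the op decomposed into
```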
Then run the benchmark script:

```bash
git clone https://gitlab-master.nvidia.com/iyashchuk/aten_ops_perf.git
cd aten_ops_perf
python aten_ops_perf.py --suite huggingface --dtype float32 --max-samples 100 --op aten._log_softmax.default
```
Check out this gist for the logs: https://gist.github.com/IvanYashchuk/8f433d9512ab1f02a7f960072ba10bb0
Badly performing samples (a standalone timing sketch for a few of these shapes follows the list):
- (512, 50265) dim=1
- (8192, 50265) dim=1
- (1024, 50265) dim=1
- (4096, 50265) dim=1
- (2048, 50265) dim=1
- (511, 30522) dim=1
- (2048, 50005) dim=1
- (256, 256008) dim=1
- (157, 50257) dim=1
- (1024, 50005) dim=1
- (256, 128112) dim=1
- (64, 128) dim=1
- (1024, 50358) dim=1
- (508, 50272) dim=1
- (511, 50257) dim=1
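For a quick standalone check outside the benchmark harness, something like the following can time a few of the worst shapes. This is a hypothetical sketch using `torch.utils.benchmark`, and it only measures the eager ATen path:

```python
import torch
import torch.utils.benchmark as benchmark

# Time eager (ATen) log_softmax on a few of the badly performing shapes.
# Most of these reduce over a very large softmax dimension.
for shape in [(512, 50265), (2048, 50265), (256, 256008)]:
    x = torch.randn(*shape, device="cuda")
    t = benchmark.Timer(
        stmt="torch.nn.functional.log_softmax(x, dim=1)",
        globals={"torch": torch, "x": x},
    )
    print(shape, t.timeit(100))
```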
`_log_softmax` is implemented here: https://github.com/pytorch/pytorch/blob/35be73df094f02dd26562cf665a6158e80bc4045/torch/_decomp/decompositions.py#L988-L1006
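For reference, that decomposition is the numerically stable form of log-softmax: subtract the row max, then subtract the logsumexp. A simplified sketch (the linked code additionally handles dtype promotion, empty inputs, and contiguity; `log_softmax_decomp` is an illustrative name):

```python
import torch
from torch import Tensor

def log_softmax_decomp(x: Tensor, dim: int) -> Tensor:
    # Subtract the max along `dim` for numerical stability.
    x_max = torch.amax(x, dim, keepdim=True)
    shifted = x - x_max
    # log(sum(exp(shifted))) along the same dim, broadcast back.
    shifted_logsumexp = torch.log(torch.sum(torch.exp(shifted), dim, keepdim=True))
    return shifted - shifted_logsumexp

# Should match the eager kernel within default tolerances.
x = torch.randn(512, 50265)
torch.testing.assert_close(log_softmax_decomp(x, 1),
                           torch.nn.functional.log_softmax(x, dim=1))
```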
### Versions
Checked on upstream master.