
fix FakeTensor creation on noncontiguous subclasses

Open bdhirsh opened this pull request 10 months ago • 5 comments

Fixes https://github.com/pytorch/pytorch/issues/125287

Fixes https://github.com/pytorch/pytorch/issues/124090; see that issue for context.

Stack from ghstack (oldest at bottom):

  • #124400
  • -> #124399
  • #124398

cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu @penguinwu @fegin @XilunWu @wanchaol @fduwjj @wz337 @tianyu-l @wconstab @yf225 @chauhang @d4l3k @ezyang @msaroufim @anijain2305 @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng

bdhirsh avatar Apr 18 '24 15:04 bdhirsh

:link: Helpful Links

:test_tube: See artifacts and rendered test results at hud.pytorch.org/pr/124399

Note: Links to docs will display an error until the docs builds have been completed.

:white_check_mark: No Failures

As of commit 0aac00782a8a8c98069f1d406a23cdd0ab176c65 with merge base e16f1ee4cc3e58dbd4755021e4b4b87a16b2aac2: :green_heart: Looks good so far! There are no failures yet. :green_heart:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot[bot] avatar Apr 18 '24 15:04 pytorch-bot[bot]

Public API sounds okay, but preferably with a clear story around when we'd use this over existing APIs that do similar "forbid_in_autograd" things like .detach() or setting .requires_grad=False. (We also have the private torch._C._functions.DelayedError, which is an out-of-place version of this.) If it is difficult to have such examples, I don't mind keeping it private either for now, but no strong preference.
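For illustration, here is a minimal sketch of the "error out if backward ever reaches this" behaviour under discussion, written as an ordinary custom autograd.Function (the GuardAgainstBackward name is hypothetical, not the proposed API and not DelayedError itself); contrast it with .detach(), which silently cuts the tensor out of the graph instead of failing loudly:

```python
import torch

class GuardAgainstBackward(torch.autograd.Function):
    # Hypothetical stand-in for the behaviour being discussed: the forward is
    # out-of-place and keeps the graph alive, but any backward through it errors.

    @staticmethod
    def forward(ctx, x):
        return x.clone()

    @staticmethod
    def backward(ctx, grad_output):
        raise RuntimeError("backward reached a guarded region")

x = torch.randn(3, requires_grad=True)

y_detached = x.detach()                    # silently disconnected; no error, no grad to x
y_guarded = GuardAgainstBackward.apply(x)
y_guarded.sum().backward()                 # raises RuntimeError by design
```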

soulitzer avatar Apr 19 '24 17:04 soulitzer

I updated the PR with @soulitzer's idea to not add a new API: it seems like torch._C._functions.DelayedError should do the job.

Adding that error node also caused some test failures, which made me realize that autograd.backward is broken in dynamo: https://github.com/pytorch/pytorch/issues/125287. I just fixed it directly in this PR.
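As a rough sketch of the call pattern involved (assuming the failure shows up when the module-level torch.autograd.backward is invoked inside a compiled function; whether this particular snippet fails depends on the PyTorch version):

```python
import torch

def f(x):
    loss = (x * x).sum()
    # Module-level API instead of loss.backward(); this is the path
    # https://github.com/pytorch/pytorch/issues/125287 is about.
    torch.autograd.backward(loss)
    return x.grad

x = torch.randn(4, requires_grad=True)
compiled = torch.compile(f, backend="eager")
compiled(x)
```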

bdhirsh avatar May 01 '24 01:05 bdhirsh

Hmm... the errors I'm getting are because y = torch._C._functions.DelayedError(1)(x) doesn't "propagate" the fact that x is a fake tensor (it returns a plain tensor)
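A quick way to see the expected behaviour is to check whether fake-ness survives an op under FakeTensorMode: ordinary ops return FakeTensors, whereas the DelayedError node was observed above to hand back a plain tensor. A small sketch of that check (using from_tensor to fakify a real tensor):

```python
import torch
from torch._subclasses.fake_tensor import FakeTensor, FakeTensorMode

mode = FakeTensorMode()
fake_x = mode.from_tensor(torch.randn(4))

with mode:
    out = fake_x + 1

# Ordinary ops propagate fake-ness; the DelayedError path above did not.
print(isinstance(fake_x, FakeTensor))  # True
print(isinstance(out, FakeTensor))     # True
```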

bdhirsh avatar May 01 '24 15:05 bdhirsh

oh it's probably because the SparseTensorImpl C++ subclasses don't override shallow_copy_and_detach properly...

We probably are not able to handle "tensor subclass holding a fake sparse tensor" today for other reasons, so I'm going to leave the sparse fakify logic alone and have it continue using clone for now.
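Roughly, the decision described above amounts to something like the following (the fakeify helper and its structure are hypothetical and purely illustrative of the branch, not the actual conversion code):

```python
import torch

def fakeify(t, mode):
    # Illustrative only (hypothetical helper): sparse inputs are cloned first,
    # matching the "continue using clone" fallback above, because their C++
    # impls don't implement shallow_copy_and_detach properly; dense inputs are
    # handed to the converter as-is.
    if t.is_sparse:
        return mode.from_tensor(t.clone())
    return mode.from_tensor(t)
```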

bdhirsh avatar May 01 '24 15:05 bdhirsh