
Fix mem size mismatch from split/chunk in const folding

Open trieuat opened this issue 1 year ago • 4 comments

Summary: An fx pass const-folds chunk/split ops applied to weights/constants, but each folded output tensor is a view that keeps the storage size of the original tensor (3x its actual size for chunk(3)). The backend, however, calculates the on-device memory size from the tensor's shape/stride/dtype, so the memory allocated on device is always smaller than the storage size of the weight/constant. The mismatch causes a runtime error when copying weights/constants to the device (T172125529).

This diff fixes the issue by cloning the tensors after const folding, so that each tensor has a storage size matching its shape.
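As a minimal sketch of the mismatch described above (not the actual fx pass; the tensor shape here is a stand-in chosen for illustration): chunk/split outputs are views that share the original buffer, while a backend computing size from shape/stride/dtype expects only the view's own data. Cloning gives each tensor compact storage.

```python
import torch

# Stand-in for a folded weight/constant (assumed shape, for illustration only).
w = torch.randn(48, 64, 3)
chunks = torch.chunk(w, 3, dim=-1)  # chunk(3): each output is a view of w

# Each view shares the original buffer, so its storage is 3x its actual data.
assert chunks[0].untyped_storage().nbytes() == w.untyped_storage().nbytes()

# Backend-style size computed from shape/dtype: numel * element_size.
actual_bytes = chunks[0].numel() * chunks[0].element_size()
assert chunks[0].untyped_storage().nbytes() == 3 * actual_bytes

# The fix: clone after const folding so storage matches the tensor's shape.
compact = chunks[0].clone()
assert compact.untyped_storage().nbytes() == actual_bytes
```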

Test Plan: Before this change (18432 = 48 * 64 * 2 * 3):

```
RuntimeError: Failed to load constant getitem_idx0 split (remaining=18432) at fbcode/caffe2/torch/fb/acc_runtime/afg/afg_bindings.cpp:3422: Request failed because an invalid parameter
```

With this change, the test passes:

```
buck2 run mode/opt //caffe2/torch/fb/acc_runtime/afg/tests:test_operators-artemis -- -r test_mem_size_mismatch
Ran 1 test in 7.048s

OK
```

Reviewed By: jfix71

Differential Revision: D56663931

trieuat avatar Apr 29 '24 23:04 trieuat

:link: Helpful Links

:test_tube: See artifacts and rendered test results at hud.pytorch.org/pr/125199

Note: Links to docs will display an error until the docs builds have been completed.

:x: 1 New Failure

As of commit 9e9ccf65bde1cb4104450b76bcb7cd277325076f with merge base 2c8237c6aa32ab7470e76bde03f2d3dcb9dd42a1:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot[bot] avatar Apr 29 '24 23:04 pytorch-bot[bot]

@pytorchbot merge

trieuat avatar Apr 30 '24 23:04 trieuat

This PR needs to be approved by an authorized maintainer before merge.

pytorch-bot[bot] avatar Apr 30 '24 23:04 pytorch-bot[bot]

This pull request was exported from Phabricator. Differential Revision: D56663931

facebook-github-bot avatar May 01 '24 17:05 facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D56663931

facebook-github-bot avatar May 02 '24 22:05 facebook-github-bot

@pytorchbot merge -f 'Landed internally'

(Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)

facebook-github-bot avatar May 03 '24 04:05 facebook-github-bot

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging Check the merge workflow status here

pytorchmergebot avatar May 03 '24 04:05 pytorchmergebot