Updates for PyTorch 2.7

Open ericspod opened this issue 7 months ago • 13 comments

Description

This will update MONAI to be compatible with PyTorch 2.7. There appear to be few code changes with this release so hopefully this will be simply a matter of updating versions.

Types of changes

  • [x] Non-breaking change (fix or new feature that would not break existing functionality).
  • [ ] Breaking change (fix or new feature that would cause existing functionality to change).
  • [ ] New tests added to cover the changes.
  • [ ] Integration tests passed locally by running ./runtests.sh -f -u --net --coverage.
  • [ ] Quick tests passed locally by running ./runtests.sh --quick --unittests --disttests.
  • [ ] In-line docstrings updated.
  • [ ] Documentation updated, tested make html command in the docs/ folder.

ericspod avatar Apr 25 '25 11:04 ericspod

It's possible the CPU provided by the Windows runner is too old for PyTorch 2.7, which may now require instructions that CPU doesn't have.

ericspod avatar Apr 25 '25 13:04 ericspod

The issue with Windows appears to be related to float64 calculations, specifically with RandRotate in tests\integration\test_pad_collation.py. This doesn't appear to be related to pad collation, and it goes away if float32 is used as the dtype. I'm investigating further.

ericspod avatar Apr 25 '25 22:04 ericspod

I think I've traced the issue to what appears to be a bug in grid_sample. I've raised an issue here on the PyTorch repo.

ericspod avatar Apr 28 '25 23:04 ericspod

Thank you for looking into this! Instead of waiting for a fix from PyTorch, do you think it's possible to implement a workaround by altering the dtype for Windows operating systems?

KumoLiu avatar Apr 29 '25 13:04 KumoLiu

I'm looking into that now and will hopefully have something soon. We may have to convert to float32 and back in places so we may have knock-on precision issues.
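
As a rough illustration of the "convert to float32 and back" idea, here is a minimal sketch. It is not the code from this PR: the helper names `needs_float32_fallback` and `grid_sample_with_fallback` are hypothetical, and the only PyTorch call assumed is `torch.nn.functional.grid_sample`.

```python
import platform
from typing import Optional


def needs_float32_fallback(dtype_name: str, system: Optional[str] = None) -> bool:
    """Return True for float64 input on Windows (the case described above)."""
    system = platform.system() if system is None else system
    return system == "Windows" and dtype_name == "float64"


def grid_sample_with_fallback(img, grid, **kwargs):
    """Hypothetical wrapper: compute in float32 where float64 misbehaves."""
    # torch is imported lazily so the decision helper above has no torch dependency.
    import torch
    import torch.nn.functional as F

    orig_dtype = img.dtype
    dtype_name = str(orig_dtype).replace("torch.", "")
    if needs_float32_fallback(dtype_name):
        out = F.grid_sample(img.to(torch.float32), grid.to(torch.float32), **kwargs)
        # Casting back restores the dtype, not the precision lost in float32.
        return out.to(orig_dtype)
    return F.grid_sample(img, grid, **kwargs)
```

Keeping the platform check in a separate function makes the decision testable on any OS, and the cast back to the original dtype is where the knock-on precision issues mentioned above would show up.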

ericspod avatar Apr 29 '25 13:04 ericspod

We may need to wait for a torch-tensorrt release that supports PyTorch 2.7: https://pypi.org/project/torch-tensorrt/#history

KumoLiu avatar Apr 30 '25 07:04 KumoLiu

Hi @KumoLiu, this gets through the Windows tests now. I raised the issue with PyTorch, so hopefully version 2.7.1 will resolve it. In the meantime we can run the blossom tests and discuss whether to merge this.

ericspod avatar May 01 '25 23:05 ericspod

/build

KumoLiu avatar May 02 '25 04:05 KumoLiu

Error log:

[2025-05-02T05:07:25.738Z]   Attempting uninstall: setuptools
[2025-05-02T05:07:25.738Z]     Found existing installation: setuptools 45.2.0
[2025-05-02T05:07:25.738Z]     Uninstalling setuptools-45.2.0:
[2025-05-02T05:07:25.738Z]       Successfully uninstalled setuptools-45.2.0
[2025-05-02T05:08:04.452Z] ERROR: Exception:
[2025-05-02T05:08:04.452Z] Traceback (most recent call last):
[2025-05-02T05:08:04.452Z]   File "/usr/lib/python3.9/py_compile.py", line 144, in compile
[2025-05-02T05:08:04.452Z]     code = loader.source_to_code(source_bytes, dfile or file,
[2025-05-02T05:08:04.452Z]   File "<frozen importlib._bootstrap_external>", line 918, in source_to_code
[2025-05-02T05:08:04.452Z]   File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
[2025-05-02T05:08:04.452Z]   File "/usr/local/lib/python3.9/dist-packages/pytype/tools/merge_pyi/test_data/parse_error.py", line 2
[2025-05-02T05:08:04.452Z]     def f(*): pass
[2025-05-02T05:08:04.452Z]            ^
[2025-05-02T05:08:04.452Z] SyntaxError: named arguments must follow bare *
[2025-05-02T05:08:04.452Z] 
[2025-05-02T05:08:04.452Z] During handling of the above exception, another exception occurred:
[2025-05-02T05:08:04.452Z] 
[2025-05-02T05:08:04.452Z] Traceback (most recent call last):
[2025-05-02T05:08:04.452Z]   File "/usr/lib/python3.9/compileall.py", line 238, in compile_file
[2025-05-02T05:08:04.452Z]     ok = py_compile.compile(fullname, cfile, dfile, True,
[2025-05-02T05:08:04.452Z]   File "/usr/lib/python3.9/py_compile.py", line 150, in compile
[2025-05-02T05:08:04.452Z]     raise py_exc
[2025-05-02T05:08:04.452Z] py_compile.PyCompileError:   File "/usr/local/lib/python3.9/dist-packages/pytype/tools/merge_pyi/test_data/parse_error.py", line 2
[2025-05-02T05:08:04.452Z]     def f(*): pass
[2025-05-02T05:08:04.452Z]            ^
[2025-05-02T05:08:04.452Z] SyntaxError: named arguments must follow bare *
[2025-05-02T05:08:04.452Z] 
[2025-05-02T05:08:04.452Z] 
[2025-05-02T05:08:04.452Z] During handling of the above exception, another exception occurred:
[2025-05-02T05:08:04.452Z] 
[2025-05-02T05:08:04.452Z] Traceback (most recent call last):
[2025-05-02T05:08:04.452Z]   File "/usr/local/lib/python3.9/dist-packages/pip/_internal/cli/base_command.py", line 105, in _run_wrapper
[2025-05-02T05:08:04.452Z]     status = _inner_run()
[2025-05-02T05:08:04.452Z]   File "/usr/local/lib/python3.9/dist-packages/pip/_internal/cli/base_command.py", line 96, in _inner_run
[2025-05-02T05:08:04.452Z]     return self.run(options, args)
[2025-05-02T05:08:04.452Z]   File "/usr/local/lib/python3.9/dist-packages/pip/_internal/cli/req_command.py", line 68, in wrapper
[2025-05-02T05:08:04.452Z]     return func(self, options, args)
[2025-05-02T05:08:04.452Z]   File "/usr/local/lib/python3.9/dist-packages/pip/_internal/commands/install.py", line 459, in run
[2025-05-02T05:08:04.452Z]     installed = install_given_reqs(
[2025-05-02T05:08:04.452Z]   File "/usr/local/lib/python3.9/dist-packages/pip/_internal/req/__init__.py", line 83, in install_given_reqs
[2025-05-02T05:08:04.452Z]     requirement.install(
[2025-05-02T05:08:04.452Z]   File "/usr/local/lib/python3.9/dist-packages/pip/_internal/req/req_install.py", line 867, in install
[2025-05-02T05:08:04.452Z]     install_wheel(
[2025-05-02T05:08:04.452Z]   File "/usr/local/lib/python3.9/dist-packages/pip/_internal/operations/install/wheel.py", line 728, in install_wheel
[2025-05-02T05:08:04.452Z]     _install_wheel(
[2025-05-02T05:08:04.452Z]   File "/usr/local/lib/python3.9/dist-packages/pip/_internal/operations/install/wheel.py", line 614, in _install_wheel
[2025-05-02T05:08:04.452Z]     success = compileall.compile_file(path, force=True, quiet=True)
[2025-05-02T05:08:04.452Z]   File "/usr/lib/python3.9/compileall.py", line 255, in compile_file
[2025-05-02T05:08:04.452Z]     msg = err.msg.encode(sys.stdout.encoding,
[2025-05-02T05:08:04.452Z] TypeError: encode() argument 'encoding' must be str, not None
[2025-05-02T05:08:04.452Z] 
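
Reading the log from the bottom up: pip byte-compiles the installed files, the pytype file test_data/parse_error.py (presumably invalid on purpose) fails to compile, and compileall then tries to encode the error message using sys.stdout.encoding, which is None in this environment. The sketch below reproduces only that last step. The class and function names are hypothetical, and the `errors` argument is assumed from CPython's compileall because it is cut off in the log.

```python
class StreamWithoutEncoding:
    """Hypothetical stand-in for a stdout object whose encoding is None."""

    encoding = None


class Utf8Stream:
    """Hypothetical stand-in for an ordinary UTF-8 stdout."""

    encoding = "utf-8"


def encode_for_stream(msg: str, stream) -> bytes:
    # Same shape as the call in the traceback: msg.encode(sys.stdout.encoding, ...)
    return msg.encode(stream.encoding, errors="backslashreplace")


def describe_encode(msg: str, stream) -> str:
    """Return the encoded bytes as text, or the TypeError raised while encoding."""
    try:
        return repr(encode_for_stream(msg, stream))
    except TypeError as exc:
        return f"TypeError: {exc}"


if __name__ == "__main__":
    print(describe_encode("SyntaxError", Utf8Stream()))
    print(describe_encode("SyntaxError", StreamWithoutEncoding()))
```

The second call fails because `str.encode` requires a string encoding name, so the original compile error ends up hidden behind the TypeError.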

KumoLiu avatar May 02 '25 10:05 KumoLiu

/build

KumoLiu avatar May 02 '25 10:05 KumoLiu

The blossom issue is related to the current pytype version, so I've added <=2024.4.11 to the requirements-dev.txt file for it.

ericspod avatar May 02 '25 12:05 ericspod

/build

KumoLiu avatar May 02 '25 12:05 KumoLiu

The blossom issue is related to the current pytype version, so I've added <=2024.4.11 to the requirements-dev.txt file for it.

This seems to be related to the new version of pip: https://pypi.org/project/pip/#history. I tried downgrading it to 25.0.1, and then it works.

I raised an issue here: https://github.com/google/pytype/issues/1909

KumoLiu avatar May 02 '25 13:05 KumoLiu

Hi @KumoLiu this should pass tests now so we can run blossom and merge. Thanks!

ericspod avatar Jul 18 '25 23:07 ericspod

/build

KumoLiu avatar Jul 21 '25 05:07 KumoLiu

Hi @KumoLiu, something didn't work with blossom. Could you share the logs, please?

ericspod avatar Jul 22 '25 12:07 ericspod

/build

KumoLiu avatar Jul 22 '25 14:07 KumoLiu

Hi @ericspod, it looks like a timeout issue, so I've just retriggered the tests.

KumoLiu avatar Jul 22 '25 14:07 KumoLiu

/build

KumoLiu avatar Jul 23 '25 15:07 KumoLiu

I get this from my uv sync:

  × No solution found when resolving dependencies for split (python_full_version == '3.12.10' and sys_platform == 'darwin'):
  ╰─▶ Because monai==1.5.0 depends on torch>=2.4.1,<2.7.0 and your project depends on monai==1.5.0, we can conclude that your project depends on torch>=2.4.1,<2.7.0.
      And because your project depends on torch==2.7.1, we can conclude that your project's requirements are unsatisfiable.

Am I missing something?

ogencoglu avatar Aug 23 '25 06:08 ogencoglu

This isn't the updated requirement from this PR. MONAI 1.5 doesn't include this PR, so you will have to install MONAI from source. For Darwin the requirement is only torch>=2.4.1.
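
To make the resolver message concrete, here is a small sketch of the comparison it is performing. The helper names are hypothetical and the parsing is deliberately simplified: it handles plain dotted versions only, not suffixes such as "+cu126" or pre-releases, so it is no substitute for a real version-specifier library.

```python
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a plain dotted version such as '2.7.1' into (2, 7, 1)."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version: str, lower: str, upper_exclusive: Optional[str] = None) -> bool:
    """Check lower <= version, and version < upper_exclusive when a bound is given."""
    parsed = parse_version(version)
    if parsed < parse_version(lower):
        return False
    if upper_exclusive is not None and parsed >= parse_version(upper_exclusive):
        return False
    return True


if __name__ == "__main__":
    # Constraint reported for monai==1.5.0: torch>=2.4.1,<2.7.0
    print(satisfies("2.7.1", "2.4.1", "2.7.0"))
    # Constraint described above for Darwin: torch>=2.4.1
    print(satisfies("2.7.1", "2.4.1"))
```

With the upper bound in place, torch 2.7.1 falls outside the allowed range, which is why the project's requirements are reported as unsatisfiable. Without the upper bound the same version is accepted.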

ericspod avatar Aug 31 '25 22:08 ericspod

Hi @ericspod do we have a target date for releasing a v1.5 update or the v1.6?

I'm asking because the torch v2.6 installed as part of pip install monai bundles CUDA runtime v12.4.127 by default, while my app needs CUDA runtime >=12.6. The app finds the lower-version runtime library that ships with torch at <venv>/lib/python3.10/site-packages/nvidia/cuda_runtime/lib/libcudart.so.12.

I have a few workarounds, but would love to see a new version of MONAI with your updates soon:

  • pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu126 then install monai, or
  • remove the cuda runtime lib.so from the site packages folder post monai installation so that torch will use the system's cuda runtime lib, or
  • pip install --upgrade nvidia-cuda-runtime-cu12 post monai installation, but this raises an error: torch 2.6.0 requires nvidia-cuda-runtime-cu12==12.4.127; platform_system == "Linux" and platform_machine == "x86_64", but you have nvidia-cuda-runtime-cu12 12.9.79 which is incompatible.
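
Whichever workaround is used, it can help to confirm which CUDA runtime wheel actually ended up in the environment. The following is a minimal sketch using only the standard library. The function names are hypothetical, the distribution name is the one from the pip error above, and the version parsing assumes plain dotted versions.

```python
from importlib import metadata
from typing import Optional, Tuple


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def to_tuple(text: str, width: int = 3) -> Tuple[int, ...]:
    """Parse '12.6' into (12, 6, 0), padding so that lengths compare fairly."""
    parts = [int(part) for part in text.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True when the installed version is at least the minimum."""
    return to_tuple(installed) >= to_tuple(minimum)


if __name__ == "__main__":
    found = installed_version("nvidia-cuda-runtime-cu12")
    if found is None:
        print("nvidia-cuda-runtime-cu12 is not installed in this environment")
    else:
        print(found, "meets >=12.6:", meets_minimum(found, "12.6"))
```

This only reports what pip installed. It does not show which libcudart the process loads at run time, which also depends on the library search path.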

Thanks, Ming

MMelQin avatar Sep 15 '25 23:09 MMelQin

Hi @MMelQin we're working on a 1.5.1 release to include support for PyTorch 2.8. This will be out shortly.

ericspod avatar Sep 16 '25 11:09 ericspod