Bump torch from 2.2.2 to 2.3.0 in /rastervision_pytorch_learner
Bumps torch from 2.2.2 to 2.3.0.
Release notes
Sourced from torch's releases.
PyTorch 2.3: User-Defined Triton Kernels in torch.compile, Tensor Parallelism in Distributed
PyTorch 2.3 Release notes
- Highlights
- Backwards Incompatible Changes
- Deprecations
- New Features
- Improvements
- Bug fixes
- Performance
- Documentation
Highlights
We are excited to announce the release of PyTorch® 2.3! PyTorch 2.3 offers support for user-defined Triton kernels in torch.compile, allowing users to migrate their own Triton kernels from eager without experiencing performance regressions or graph breaks. As well, Tensor Parallelism improves the experience for training Large Language Models using native PyTorch functions, which has been validated on training runs for 100B parameter models.
This release is composed of 3393 commits and 426 contributors since PyTorch 2.2. We want to sincerely thank our dedicated community for your contributions. As always, we encourage you to try these out and report any issues as we improve 2.3. More information about how to get started with the PyTorch 2-series can be found at our Getting Started page.
... (truncated)
Commits
- 97ff6cf [Release only] Release 2.3 start using triton package from pypi (#123580)
- fb38ab7 Fix for MPS regression in #122016 and #123178 (#123385)
- 23961ce [Release/2.3] Set py3.x build-environment name consistently (#123446)
- 634cf50 [Wheel] Change libtorch_cpu OpenMP search path (#123417) (#123442)
- 12d0e69 update submodule onnx==1.16.0 (#123387)
- 38acd81 [MPS] Fwd-fix for clamp regression (#122148) (#123383)
- b197f54 Use numpy 2.0.0rc1 in CI (#123356)
- dc81d19 [CI] Test that NumPy-2.X builds are backward compatible with 1.X (#123354)
- 108305e Upgrade submodule pybind to 2.12.0 (#123355)
- a8b0091 Make PyTorch compilable against upcoming Numpy-2.0 (#121880) (#123380)
- Additional commits viewable in compare view
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
Another option would be to remove any figure logging from TorchGeo entirely. If users want it, they can subclass the trainer and add it. This feature has historically been a nightmare to maintain/test and has been riddled with bugs. The proposal here is to make it even more complicated.
More positively, can you start a discussion with the Lightning folks to see if there is an easier way to do this automatically? Or a way that can unify the interface somehow? This just seems like too much work to support out of the box.
Re: Lightning, I see a lot of work was done in https://github.com/Lightning-AI/pytorch-lightning/pull/6227, but ultimately they decided it was too complex to maintain and closed the PR.
Perhaps, as you say, figure logging should be left to users who subclass the trainers. I think basic logging of samples from the dataset is a necessity, however.
I've actually been moving away from adding the plotting directly inside the LightningModule and towards using a custom Callback instead. Hooks like on_validation_batch_end have access to the current batch, module, batch_idx, etc., and can essentially replicate what the current figure logging was doing.
See https://lightning.ai/docs/pytorch/stable/_modules/lightning/pytorch/callbacks/callback.html#Callback.on_validation_batch_end
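A minimal sketch of that callback pattern, assuming a TorchGeo-style dataset with a `plot(sample)` method and a TensorBoard logger. The `val_dataset` attribute, figure tags, and helper names here are hypothetical, not part of any existing API:

```python
import matplotlib.pyplot as plt

try:
    from lightning.pytorch.callbacks import Callback
except ImportError:  # let the sketch import even without Lightning installed
    Callback = object


def sample_from_batch(batch: dict, i: int) -> dict:
    """Pull the i-th sample out of a collated batch dict."""
    return {key: value[i] for key, value in batch.items()}


class PlotValidationSamples(Callback):
    """Log figures for the first validation batch of each epoch.

    Assumes ``trainer.datamodule.val_dataset`` exposes a TorchGeo-style
    ``plot(sample)`` method and that ``trainer.logger`` is a
    TensorBoardLogger (both are assumptions of this sketch).
    """

    def __init__(self, max_samples: int = 4):
        self.max_samples = max_samples

    def on_validation_batch_end(
        self, trainer, pl_module, outputs, batch, batch_idx, dataloader_idx=0
    ):
        if batch_idx != 0:  # only plot the first batch per epoch
            return
        dataset = trainer.datamodule.val_dataset  # assumed attribute
        n = min(self.max_samples, len(batch["image"]))
        for i in range(n):
            fig = dataset.plot(sample_from_batch(batch, i))
            trainer.logger.experiment.add_figure(
                f"val/sample_{i}", fig, global_step=trainer.global_step
            )
            plt.close(fig)  # free the figure so repeated epochs don't leak memory
```

The callback would then be passed as `Trainer(callbacks=[PlotValidationSamples()])`, leaving the LightningModule untouched.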
I agree that using callbacks would be the most intuitive approach. Do we think it's best to provide them out of the box, or just to have a 'best practice' guide?
Probably better to cover this in a tutorial, or just to reference the Lightning docs, so we don't have to maintain them ourselves.
I've been reading up on Raster Vision; perhaps we can use its plotting functionality with Lightning callbacks. A simple tutorial could suffice: https://docs.rastervision.io/en/stable/usage/tutorials/visualize_data_samples.html
Is there something wrong with TorchGeo's plotting functionality (i.e., dataset.plot(sample))? It's certainly not perfect, but I don't know if Raster Vision's is any better.
Nothing wrong; I just thought at some point there was a comment or issue about externalising the plotting.
I'm for this (even getting rid of the datamodule plots) on the condition that we write a tutorial that shows how to subclass and do this with TensorBoard. torchgeo doesn't have to (and shouldn't try to) do everything by itself; the more we can offload to the user with tutorials, the better! This is definitely a good example, as I've had to override all of validation_step specifically to implement MLflow logging for Azure ML in my own projects.
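For the MLflow case, the same callback hook could route figures to MLflow's `mlflow.log_figure` rather than TensorBoard, avoiding the validation_step override entirely. A hedged sketch, where the helper names and artifact layout are hypothetical and an active MLflow run is assumed:

```python
try:
    import mlflow
except ImportError:  # let the sketch import even without MLflow installed
    mlflow = None


def figure_artifact_path(stage: str, epoch: int, index: int) -> str:
    """Deterministic artifact path for a logged figure,
    e.g. 'val/epoch_003/sample_0.png' (layout is an arbitrary choice)."""
    return f"{stage}/epoch_{epoch:03d}/sample_{index}.png"


def log_figure_mlflow(fig, stage: str, epoch: int, index: int) -> None:
    """Attach a matplotlib figure to the active MLflow run as a PNG artifact."""
    mlflow.log_figure(fig, figure_artifact_path(stage, epoch, index))
```

Inside an on_validation_batch_end hook this would be called as, say, `log_figure_mlflow(fig, "val", trainer.current_epoch, i)`, so swapping the logging backend only means swapping one helper.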
I'm all for tutorials. Good to hear you're also an MLflow user; it's in my stack too, and I've made a bunch of customisations to support it. It would be good to share best practices.