Carlos Mocholí
### 📚 Documentation

There are several TPU-specific topics to update in the documentation before the v2.1 release.

### Update the guide in Trainer with:

- [ ] v4 support
-...
### Description & Motivation

`trainer.fit` only works with `CombinedLoader(..., mode="max_size_cycle"|"min_size")`.

`trainer.{validate,test,predict}` only works with `CombinedLoader(..., mode="sequential")`.

This constraint is checked in the top-level loops:

https://github.com/Lightning-AI/lightning/blob/0009cde1db1a9ab4e2f1e0a9f69a4affb59d5134/src/lightning/pytorch/loops/fit_loop.py#L351-L354

https://github.com/Lightning-AI/lightning/blob/0009cde1db1a9ab4e2f1e0a9f69a4affb59d5134/src/lightning/pytorch/loops/evaluation_loop.py#L182-L183

### Pitch

Have all...
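
The mode constraint described in this issue can be illustrated with a small, self-contained sketch. This is not Lightning's implementation: the table `SUPPORTED_MODES` and the helper `check_combined_loader_mode` are hypothetical names introduced here, and the sketch only encodes the pairing of trainer functions and `CombinedLoader` modes stated in the issue text.

```python
# Hypothetical sketch of the constraint described in the issue.
# Neither SUPPORTED_MODES nor check_combined_loader_mode exists in Lightning;
# they only mirror the fn/mode pairing stated in the issue text.
SUPPORTED_MODES = {
    "fit": {"max_size_cycle", "min_size"},
    "validate": {"sequential"},
    "test": {"sequential"},
    "predict": {"sequential"},
}


def check_combined_loader_mode(trainer_fn: str, mode: str) -> None:
    """Raise ValueError if `mode` is not listed as supported for `trainer_fn`."""
    if trainer_fn not in SUPPORTED_MODES:
        raise ValueError(f"Unknown trainer function: {trainer_fn!r}")
    supported = SUPPORTED_MODES[trainer_fn]
    if mode not in supported:
        raise ValueError(
            f"`trainer.{trainer_fn}` does not support mode={mode!r}; "
            f"expected one of {sorted(supported)}"
        )


# Supported combinations pass silently.
check_combined_loader_mode("fit", "min_size")
check_combined_loader_mode("validate", "sequential")

# An unsupported combination is rejected.
try:
    check_combined_loader_mode("fit", "sequential")
except ValueError as err:
    print(err)
```

A lookup table keeps the rule in one place, so relaxing the constraint would only require adding modes to the relevant set.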
## What does this PR do?

Fixes #18936

The main change is in `src/lightning/pytorch/strategies/strategy.py`. Everything else is a bit of flattening and making it as consistent as possible with its...
## What does this PR do?

Part of #16130

Sets up XLA testing on CUDA CI and other related pieces to get things working

- Add connector support for non-TPU...
## What does this PR do?

Fixes https://github.com/Lightning-AI/lightning/issues/10436

Logging per step during fit's validation, regular validation, testing, or predicting is not generally useful when the logged values depend on the...
## What does this PR do?

Fixes (keep open) #7534 for Lite

Fixes #13821.

TODO:

- Integrate into the strategies
- Clean up random distributed files (marked with FIXME).
-...
## What does this PR do?

:crossed_fingers:

### Does your PR introduce any breaking changes? If yes, please list them.

None

cc @carmocca @justusschock @awaelchli @borda
## What does this PR do?

Guide: https://cloud.google.com/tpu/docs/v5e-training
### Description & Motivation

https://github.com/pytorch/pytorch/pull/104810 adds the recommendation that the `save` APIs should be called in a single node (`shard_group`). https://github.com/pytorch/pytorch/issues/102904#issuecomment-1862892480 also talks about this.

Our logic doesn't do this...
### Bug description

See added deprecation warnings in https://github.com/pytorch/pytorch/pull/113867

### What version are you seeing the problem on?

v2.2

### How to reproduce the bug

Originated from https://github.com/Lightning-AI/pytorch-lightning/blob/b097a4df3f3fa8b4465861ccab17a44a8ae1ebb9/src/lightning/fabric/strategies/fsdp.py#L496

We already...