If the newer `save` is used, the argument order seems to have changed in https://github.com/pytorch/pytorch/pull/117772:

```python
/home/carlos/nightly-env/lib/python3.10/site-packages/torch/distributed/checkpoint/utils.py:409: UserWarning: The argument order of save has been changed. Please check the document...
```
Technically, lit-gpt hasn't relied on nightly since the 2.2 release. I opened #19463.
Also opened https://github.com/pytorch/pytorch/issues/119802 upstream. We might want to silence these warnings once that issue is resolved.
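If we do, a minimal sketch of how; the message pattern is an assumption based on the warning text above and may change upstream:

```python
import warnings

# Match the start of the UserWarning emitted by torch.distributed.checkpoint.
# The message text is copied from the warning above (it is matched as a regex
# against the beginning of the message).
warnings.filterwarnings("ignore", message="The argument order of save has been changed")
```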
https://github.com/pytorch/pytorch/issues/119800#issuecomment-1942156271 suggests that (in 2.2+) we should replace most of what we have with the `{get,set}_{model,optimizer}_state_dict` functions in https://github.com/pytorch/pytorch/blob/v2.2.0/torch/distributed/checkpoint/state_dict.py.
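For reference, a minimal sketch of what that replacement could look like. The toy module, the dummy step, and the `full_state_dict`/`cpu_offload` options are illustrative; a real integration would pass the actual (possibly FSDP-wrapped) module and plug into our checkpoint plumbing:

```python
import torch
from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_model_state_dict,
    get_optimizer_state_dict,
    set_model_state_dict,
    set_optimizer_state_dict,
)

# Toy module and optimizer standing in for the real (FSDP-wrapped) model.
model = torch.nn.Linear(4, 4)
optimizer = torch.optim.AdamW(model.parameters())
model(torch.randn(2, 4)).sum().backward()
optimizer.step()  # populate the optimizer state

# Saving: the getters return state dicts that work uniformly for plain,
# DDP-, and FSDP-wrapped modules.
options = StateDictOptions(full_state_dict=True, cpu_offload=True)
model_sd = get_model_state_dict(model, options=options)
optim_sd = get_optimizer_state_dict(model, optimizer, options=options)

# Loading: the setters mirror the getters.
set_model_state_dict(model, model_sd, options=options)
set_optimizer_state_dict(model, optimizer, optim_state_dict=optim_sd, options=options)
```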
We have support for a limited set of scripts at https://github.com/Lightning-AI/litgpt/tree/main/xla. Give it a shot; it should work on a v4-32. Some of the information may be outdated.
@rasbt We follow the same initialization as Microsoft's loralib (https://github.com/microsoft/LoRA/blob/main/loralib/layers.py#L266-L271), which itself matches what you propose: https://github.com/pytorch/pytorch/blob/main/torch/nn/modules/linear.py#L106-L109.
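Concretely, a minimal sketch of that scheme, matching the linked loralib lines; the rank and feature sizes here are illustrative, not our actual layer definitions:

```python
import math
import torch
import torch.nn as nn

# Illustrative shapes: r is the LoRA rank, in/out features of the adapted layer.
r, in_features, out_features = 8, 512, 512
lora_A = nn.Parameter(torch.empty(r, in_features))
lora_B = nn.Parameter(torch.empty(out_features, r))

# Same init as loralib (and as nn.Linear.reset_parameters for the weight):
# A gets Kaiming-uniform, B starts at zero, so the LoRA delta B @ A is zero
# at initialization and training starts from the pretrained weights.
nn.init.kaiming_uniform_(lora_A, a=math.sqrt(5))
nn.init.zeros_(lora_B)
```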
This stub is only defined so it appears in the docs. Removing kwargs will mean this now raises:

```python
class MyModel(LightningModule):
    def forward(self, *inputs, **kwargs):
        return super().forward(*inputs, **kwargs)

m =...
```
This can be based on https://github.com/EleutherAI/cookbook/blob/main/calc/calc_transformer_mem.py or https://vram.asmirnov.xyz/. It could run at the beginning of the training script or be a separate script that you call. (from #920)
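A rough sketch of the kind of estimate such a script could produce. The byte counts assume bf16 weights/gradients and Adam-style fp32 moments, and the formula is a deliberate simplification; the linked calculators also model activations, buffers, and parallelism:

```python
def estimate_training_memory_gib(
    num_params: float,
    weight_bytes_per_param: int = 2,     # bf16 weights
    grad_bytes_per_param: int = 2,       # bf16 gradients
    optimizer_bytes_per_param: int = 8,  # Adam: two fp32 moments
) -> float:
    """Very rough lower bound: weights + gradients + optimizer state.

    Activations, master weight copies, and fragmentation are ignored;
    the linked calculators account for those as well.
    """
    total_bytes = num_params * (
        weight_bytes_per_param + grad_bytes_per_param + optimizer_bytes_per_param
    )
    return total_bytes / 1024**3

# Example: a 7B-parameter model needs at least ~78 GiB before activations.
print(f"{estimate_training_memory_gib(7e9):.1f} GiB")
```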
This is blocked by https://github.com/pytorch/xla/issues/4988
Another FAQ entry would be support for dynamic shapes.