Carlos Mocholí
litdata is meant to be used with a regular `DataLoader`, so there's nothing specific to do on a TPU machine. If you use Fabric or PyTorch Lightning, that will take...
You are correct, Kevin. This is not an nvfuser issue. The code was also using some removed arguments. I updated the description.
Debugging shows that `qkv` is `float32` in https://github.com/Lightning-AI/lightning-thunder/blob/main/thunder/tests/lit_gpt_model.py#L460 which is unexpected
Are you sure? If I print the dtype in the model:

```diff
$ git diff
diff --git a/thunder/tests/lit_gpt_model.py b/thunder/tests/lit_gpt_model.py
index 04d7cfa8..5cc44e46 100644
--- a/thunder/tests/lit_gpt_model.py
+++ b/thunder/tests/lit_gpt_model.py
@@ -458,6 +458,7 @@...
```
@nikitaved Faster and better code is very welcome in LitGPT. I benchmarked a few different implementations when this was added and this came out to be the best in general...
`nonzero` doesn't have a `shape=` argument. Did you mean `as_tuple=`?
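For reference, a quick sketch of what `as_tuple=` does (hypothetical tensor, assuming standard `torch.nonzero` semantics):

```python
import torch

x = torch.tensor([[1, 0], [0, 2]])

# Default: a single (n, ndim) tensor of index pairs
idx = torch.nonzero(x)  # tensor([[0, 0], [1, 1]])

# as_tuple=True: one 1-D index tensor per dimension,
# directly usable for advanced indexing
rows, cols = torch.nonzero(x, as_tuple=True)
print(x[rows, cols])  # the nonzero values themselves
```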
`get_all_executors` makes sure that all the 1st party executors are imported: https://github.com/Lightning-AI/lightning-thunder/blob/b1f447022b0732e83c11661c30746568280834f7/thunder/extend/__init__.py#L354-L366. This is necessary to avoid silently returning `None` for executors that are importable but haven't been imported yet. One simple fix could be...
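A toy illustration of why the eager imports matter (made-up registry, not thunder's actual one): a name-keyed lookup silently returns `None` until the module whose import side-effect registers the executor has actually run.

```python
# Made-up registry to illustrate the lookup problem; in thunder, the
# registration happens as an import side-effect of each executor module.
_registry: dict[str, object] = {}

def register(name: str, executor: object) -> None:
    _registry[name] = executor

def get_executor(name: str):
    # Silently None if the registering import never ran
    return _registry.get(name)

assert get_executor("sdpa") is None   # importable, but not imported yet
register("sdpa", object())            # what the import side-effect does
assert get_executor("sdpa") is not None
```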
You could maybe write a test that mocks the `torch.compile` call here: https://github.com/Lightning-AI/lightning-thunder/blob/main/thunder/executors/torch_compile.py#L84 and asserts that the correct values were passed. You could also force a graph-break in your function...
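A minimal sketch of that mocking pattern (the `run_executor` helper and the option values here are made up for illustration; the real call site is the `torch.compile` call linked above):

```python
from unittest import mock

import torch

def run_executor():
    # Stand-in for the code path under test that eventually calls
    # torch.compile (the real one is in thunder/executors/torch_compile.py)
    return torch.compile(lambda x: x + 1, fullgraph=True)

def test_compile_options_forwarded():
    with mock.patch("torch.compile") as compile_mock:
        run_executor()
    # Assert that the expected options reached torch.compile
    _, kwargs = compile_mock.call_args
    assert kwargs == {"fullgraph": True}
```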
`get_compile_option` doesn't get invoked automatically. You need to add it in your PR. If I'm not mistaken, right before `torch.compile()` gets called `make_compiled` will get called on the first call...
A design flaw with `python_print` is that common debugging patterns like `python_print(f"Before layernorm: {x}")` don't work: the string conversion for `x` has already happened by the time the call runs, so you get `Before layernorm:...
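To illustrate with plain Python (no thunder involved; `TraceProxy` is a made-up stand-in for a traced value): the f-string stringifies the value *before* the print helper is called, so the helper only ever receives text, never the object.

```python
class TraceProxy:
    """Made-up stand-in for a traced value with no concrete data yet."""
    def __str__(self):
        return "<proxy: value unknown at trace time>"

x = TraceProxy()

# The f-string converts x to a string before any print helper runs,
# so the runtime value can no longer be recovered from `message`.
message = f"Before layernorm: {x}"
print(message)  # Before layernorm: <proxy: value unknown at trace time>
```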