Thomas Viehmann

227 comments by Thomas Viehmann

But how does the lack of propagation actually create nondeterminism here?

So I think we'd be dropping the caching you link after #1956, hopefully?

Hint from the expert (thank you @tfogal): This can be avoided by using flash-attention.
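(For reference, a minimal sketch of what "use flash-attention" can look like in plain PyTorch; this is an illustration of the technique, not the exact fix from the thread. It assumes PyTorch >= 2.3 and a CUDA device, and the shapes and dtype are invented.)

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

# Invented shapes: (batch, heads, seq_len, head_dim); flash-attention
# wants a CUDA device and half-precision inputs.
q, k, v = (torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
           for _ in range(3))

# Restrict kernel selection to the flash-attention backend only; this
# raises an error instead of silently falling back if it cannot be used.
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    out = F.scaled_dot_product_attention(q, k, v)
```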

Isn't torch.Tensor.copy_ a legit method?
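(For anyone following along, a tiny sketch of what torch.Tensor.copy_ does; the tensors here are made up.)

```python
import torch

dst = torch.zeros(3, dtype=torch.float32)
src = torch.tensor([1, 2, 3], dtype=torch.int64)

# In-place copy of src's values into dst; src is broadcast to dst's
# shape if needed, and dtypes may differ (here int64 -> float32).
dst.copy_(src)
print(dst)  # tensor([1., 2., 3.])
```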

I for one would love to see a constant folding pass.
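(To make the idea concrete, here is a minimal, generic constant-folding pass. It works bottom-up on Python ASTs rather than on Thunder traces, so it illustrates the technique only; it is not a proposal for what the actual pass would look like.)

```python
import ast
import operator

# Map AST binary operators to their Python implementations.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

class ConstantFolder(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold children first (bottom-up)
        if (isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and type(node.op) in _OPS):
            value = _OPS[type(node.op)](node.left.value, node.right.value)
            return ast.copy_location(ast.Constant(value), node)
        return node

tree = ast.parse("x = 2 * 3 + y")
folded = ast.fix_missing_locations(ConstantFolder().visit(tree))
print(ast.unparse(folded))  # x = 6 + y
```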

What is the trace when this happens? We identified this as unclear behaviour, but I'm wondering whether the .to is from the user code or from a decomposition.

I think this is pretty dubious. It starts with not capturing the side effects of importing, as caught by the CI, but it probably also impacts typecheckers etc. If you have the...

>> If you have the (arguably somewhat special) need to import litgpt.config without importing litgpt, how about you add the litgpt path to PYTHONPATH and import config that...

> We don't just use litgpt.config, we also use litgpt.args (so we'd have to patch both). And unfortunately this solution would leave us without the nice typechecking...
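(For context, a self-contained sketch of the PYTHONPATH workaround floated above, using a throwaway stand-in package rather than litgpt itself; the file layout and names are invented. It also makes the side-effect objection concrete: the package __init__ never runs.)

```python
import os
import sys
import tempfile

# Stand-in for a package checkout: pkg/__init__.py has import-time side
# effects, pkg/config.py is the module we want to import on its own.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "pkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("print('pkg/__init__.py side effects run here')\n")
with open(os.path.join(pkg_dir, "config.py"), "w") as f:
    f.write("VALUE = 42\n")

# The workaround: put the package directory itself on the path (which is
# what adding it to PYTHONPATH does), then import config as a top-level
# module. This loads pkg/config.py without ever executing pkg/__init__.py.
sys.path.insert(0, pkg_dir)
import config

print(config.VALUE)          # 42
print("pkg" in sys.modules)  # False -- __init__.py side effects skipped
```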

Can we get a big picture here, please? How much are we looking to save, and what role does this play in the discussion in #169? How does this...