Thomas Viehmann
Looking at what happened: I think that while the visitor re-executes everything that is not replaced, it does so with the old outputs, not the new ones (in contrast to `interpret_trace` /...
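To make the distinction concrete, here is a minimal, self-contained sketch of the failure mode; all names (`Symbol`, `visit`, the replacement mapping) are hypothetical illustrations, not thunder's actual visitor API:

```python
from dataclasses import dataclass

# Sketch: a visitor that re-emits unreplaced symbols must remap their
# inputs to the *new* outputs of earlier replacements; re-executing them
# against the old outputs silently drops every upstream replacement.

@dataclass
class Symbol:
    name: str
    inputs: tuple
    outputs: tuple

def visit(symbols, replacements):
    remap = {}   # old output name -> replacement output name
    out = []
    for sym in symbols:
        new_sym = replacements.get(sym.name)
        if new_sym is not None:
            # record the renamed outputs so later symbols consume them
            remap.update(dict(zip(sym.outputs, new_sym.outputs)))
            out.append(new_sym)
        else:
            # the crucial step: rewire inputs to the new outputs;
            # skipping this reproduces the bug described above
            inputs = tuple(remap.get(i, i) for i in sym.inputs)
            out.append(Symbol(sym.name, inputs, sym.outputs))
    return out

trace = [Symbol("a", (), ("t0",)), Symbol("b", ("t0",), ("t1",))]
print(visit(trace, {"a": Symbol("a2", (), ("t0_new",))}))
# "b" is re-emitted consuming "t0_new", not the stale "t0"
```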
@crcrpar Thank you for pointing that out! So what kind of delay should we have to be sure the benchmarking works without it?
@crcrpar @mpatel31415 So, with a few more weeks behind us, are we more confident?
So I'm still not 100% sure about the motivation: what is the harm in the status quo that this would fix? Currently, the task would be to have `thunder.jit(...,...
> Are people more comfortable with enabling it by default in the hidden thunder.jit that happens inside the dynamo frontend?

I think this would fit well with the philosophy of...
Looks great, thank you @ysjprojects. Should we limit the default kv-cache size, though?
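For context, a minimal sketch of what capping the default could look like; the names here (`build_kv_cache`, `max_cache_len`) and the cache layout are hypothetical, not litgpt's actual API:

```python
import torch

# Hypothetical sketch: pre-allocate a KV cache whose default length is
# capped, so a model with a huge context window does not allocate the
# full-length cache unless the user explicitly asks for it.
def build_kv_cache(batch_size, n_heads, head_dim,
                   max_cache_len=4096, dtype=torch.float32):
    shape = (batch_size, n_heads, max_cache_len, head_dim)
    return torch.zeros(shape, dtype=dtype), torch.zeros(shape, dtype=dtype)

k_cache, v_cache = build_kv_cache(batch_size=1, n_heads=8, head_dim=64)
```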
> `left_padding = not torch.sum(input_ids[:, -1] == torch.tensor(self.pad_token_id))`

Note that this looks pretty bad from a "data-dependent control flow" perspective and has, indeed, been changed in transformers four months...
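To spell out why this is problematic for tracing: the Python `not` calls `Tensor.__bool__`, which needs the tensor's value on the host, so a tracer must either synchronize or specialize the trace to one branch. A small sketch (the tensor-level alternative is illustrative, not what transformers actually changed it to):

```python
import torch

input_ids = torch.tensor([[5, 5, 1], [5, 1, 1]])
pad_token_id = 5

# Data-dependent: `not` forces the value out of the tensor onto the host,
# baking one branch outcome into any captured trace.
left_padding = not torch.sum(input_ids[:, -1] == pad_token_id)

# Trace-friendlier: keep the result as a tensor (or decide the padding
# side from static config) so no host-side branch is needed.
left_padding_t = (input_ids[:, -1] == pad_token_id).any().logical_not()
```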
Right, I'm stupid. They changed it for modeling_llava_next.py, not modeling_llava.py. :(
Hi @moghadas76, it would be great to have this! Note, though, that given the large amount of interest, this is likely a time-sensitive endeavour. If you plan to implement it,...
use of scopes (see also https://github.com/Lightning-AI/lightning-thunder/issues/935)