Thomas Viehmann
Looking at what happened: I think that while the visitor re-executes everything that is not replaced, it does so with the old outputs, not the new ones (in contrast to `interpret_trace` /...
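To make the distinction concrete, here is a minimal, self-contained sketch of the failure mode; all names (`Symbol`, `visit`, the replacement mapping) are hypothetical illustrations, not thunder's actual visitor API:

```python
from dataclasses import dataclass

# Sketch: a visitor that re-emits unreplaced symbols must remap their
# inputs to the *new* outputs of earlier replacements; re-executing them
# against the old outputs silently drops every upstream replacement.

@dataclass
class Symbol:
    name: str
    inputs: tuple
    outputs: tuple

def visit(symbols, replacements):
    remap = {}   # old output name -> replacement output name
    out = []
    for sym in symbols:
        new_sym = replacements.get(sym.name)
        if new_sym is not None:
            # record the renamed outputs so later symbols consume them
            remap.update(dict(zip(sym.outputs, new_sym.outputs)))
            out.append(new_sym)
        else:
            # the crucial step: rewire inputs to the new outputs;
            # skipping this reproduces the bug described above
            inputs = tuple(remap.get(i, i) for i in sym.inputs)
            out.append(Symbol(sym.name, inputs, sym.outputs))
    return out

trace = [Symbol("a", (), ("t0",)), Symbol("b", ("t0",), ("t1",))]
print(visit(trace, {"a": Symbol("a2", (), ("t0_new",))}))
# "b" is re-emitted consuming "t0_new", not the stale "t0"
```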
@crcrpar Thank you for pointing that out! So what kind of delay should we have to be sure the benchmarking works without it?
@crcrpar @mpatel31415 So, with a few more weeks behind us, are we more confident?
So I'm still not 100% sure about the motivation: what is the harm in the status quo that this would fix? Currently, the task would be to have `thunder.jit(...,...
> Are people more comfortable with enabling it by default in the hidden thunder.jit that happens inside the dynamo frontend?

I think this would fit well with the philosophy of...
Looks great, thank you @ysjprojects. Should we limit the default kv-cache size, though?
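For context, a minimal sketch of what capping the default could look like; the names here (`build_kv_cache`, `max_cache_len`) and the cache layout are hypothetical, not litgpt's actual API:

```python
import torch

# Hypothetical sketch: pre-allocate a KV cache whose default length is
# capped, so a model with a huge context window does not allocate the
# full-length cache unless the user explicitly asks for it.
def build_kv_cache(batch_size, n_heads, head_dim,
                   max_cache_len=4096, dtype=torch.float32):
    shape = (batch_size, n_heads, max_cache_len, head_dim)
    return torch.zeros(shape, dtype=dtype), torch.zeros(shape, dtype=dtype)

k_cache, v_cache = build_kv_cache(batch_size=1, n_heads=8, head_dim=64)
```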
> `left_padding = not torch.sum(input_ids[:, -1] == torch.tensor(self.pad_token_id))`

Note that this looks pretty bad from a "data-dependent control flow" perspective and has, indeed, been changed in transformers four months...
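To spell out why this is problematic for tracing: the Python `not` calls `Tensor.__bool__`, which needs the tensor's value on the host, so a tracer must either synchronize or specialize the trace to one branch. A small sketch (the tensor-level alternative is illustrative, not what transformers actually changed it to):

```python
import torch

input_ids = torch.tensor([[5, 5, 1], [5, 1, 1]])
pad_token_id = 5

# Data-dependent: `not` forces the value out of the tensor onto the host,
# baking one branch outcome into any captured trace.
left_padding = not torch.sum(input_ids[:, -1] == pad_token_id)

# Trace-friendlier: keep the result as a tensor (or decide the padding
# side from static config) so no host-side branch is needed.
left_padding_t = (input_ids[:, -1] == pad_token_id).any().logical_not()
```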
Right, I'm stupid. They changed it for modeling_llava_next.py, not modeling_llava.py. :(
Hi @moghadas76, it would be great to have this! Note, though, that given the large amount of interest, this is likely a time-sensitive endeavour. If you plan to implement it,...
use of scopes (see also https://github.com/Lightning-AI/lightning-thunder/issues/935)