Thomas Viehmann

Results 227 comments of Thomas Viehmann

> you need to generate tokens up to the maximum length

Well, so the rule is basically that the launch configuration of, and parameters to, the GPU kernel calls can't...

Some more detail:

- make new symbols go to the right place,
- how the user sets it.

Given that we use the scopes, for the first, I'd probably push...

> This problem can also be properly resolved in prologue trace. i.e. here i1 is unpacked in prologue, because it is consumed by the top level symbol ltorch.getitem.

Unfortunately the...

Related:

- #1134

Issues for the steps:

- #1222
- #1220

After #1220 is solved, we could use the present issue to track the remainder of the work. Inside the...

That seems rather unclear to me. I think we should not; possibly we should remove some subsymbols instead.

As an update, replacing `thunder_model` with

```python
recipe = thunder.recipes.HFTransformers()
recipe.executor_names = [
    'nvfuser',
    'inplace_index_copy_ex',
    'sdpa_mask_transform_ex',
]
thunder_model = thunder.compile(
    model,
    recipe=recipe,
    # plugins=thunder.plugins.ReduceOverhead(),  # CUDAGraphs will produce garbage output...
)
```

this is from `torch.ops.higher_order.autograd_function_apply` wanting #1134. (As mentioned elsewhere, I think a more timely way to fix this would be to follow the `torch.autograd.Function` lookaside pattern and acquire the fw and...
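For context, a minimal pure-Python sketch of what "acquiring the fw and bw as separate callables" in a lookaside could look like. This uses a stand-in class instead of the real `torch.autograd.Function`, and the names `ToyFunction` and `lookaside` are hypothetical; it only illustrates the shape of the pattern, not thunder's actual implementation.

```python
# Stand-in for a torch.autograd.Function-like class: it bundles a
# forward rule and a backward rule as static methods (toy example).
class ToyFunction:
    @staticmethod
    def forward(ctx, x):
        ctx["saved"] = x          # emulates ctx.save_for_backward
        return x * x

    @staticmethod
    def backward(ctx, grad_out):
        return 2 * ctx["saved"] * grad_out


def lookaside(fn_cls, x):
    """Hypothetical lookaside: rather than treating fn_cls.apply as one
    opaque operation, pick up the forward and backward as separate
    callables so each can be traced/transformed on its own."""
    fw = fn_cls.forward
    bw = fn_cls.backward
    ctx = {}
    out = fw(ctx, x)
    # Return the output plus a closure that runs the backward pass.
    return out, lambda grad: bw(ctx, grad)


out, pullback = lookaside(ToyFunction, 3.0)
# out is 9.0; pullback(1.0) is 6.0 (d/dx x^2 at x = 3)
```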

So this could be automated: https://github.com/Lightning-AI/lightning-thunder/blob/10a8a449109a2f33caa722b9cff2d588c8eb1954/thunder/core/jit_ext.py#L644-L652