Carlos Mocholí comments

Results: 427 comments of Carlos Mocholí

Improve no_autocast overhead from 3.1 µs to 0.5 µs (6x improvement)

Hi friends. You should be able to check this by jitting a model with the `torch.compile` executor enabled, making sure that `fullgraph=True` is set (disallowing graph breaks). It should be an...

Support for torchvision models, e.g., a simple ViT

Hey Seb! @nikitaved Just merged a PR to improve the messaging here: #78 The TLDR is that you want to run `examine` on the model to get a report of...

`Chat` consumes more VRAM than `Generate`

> But it's easily fixable. We can preallocate kv-cache for the first turn in the same fashion as in the generate script and then, if in the current turn the...

Whether clarification/documentation/redesign is needed for customizing LightningCLI subcommands

> It would be good to include it so that people are more aware of a recommended way to do this. Fully agree. > am not convinced that it is...

Generating batch outputs?

Sorry, this is not implemented at the moment, to keep the generation code simple to understand (it's inherited from nanoGPT).

Generating batch outputs?

I'm not 100% familiar with the advantages of left vs right padding, so if one of you has a good resource on this, I'd appreciate it if you could share it.

From what I understand, right padding will not require creating an attention mask (so you can keep using flash attention), but then one cannot simply index with `-1` here: https://github.com/Lightning-AI/lit-gpt/blob/main/generate/base.py#L62
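To illustrate the `-1` issue, here is a minimal sketch in plain Python (not the lit-gpt code). It assumes a hypothetical `PAD_ID` of 0 and made-up token ids, and the helper names are invented for this example. With right padding, the last column of a batch holds padding for every sequence shorter than the longest one, so the position to read has to come from each sequence's own length.

```python
PAD_ID = 0  # hypothetical padding token id, chosen only for this sketch


def right_pad(seqs, pad_id=PAD_ID):
    """Pad every sequence on the right to the length of the longest one."""
    max_len = max(len(s) for s in seqs)
    return [s + [pad_id] * (max_len - len(s)) for s in seqs]


def last_real_positions(seqs):
    """Index of the last non-padding token for each unpadded sequence."""
    return [len(s) - 1 for s in seqs]


prompts = [[5, 6, 7], [8, 9], [3]]  # made-up token ids
batch = right_pad(prompts)
last = last_real_positions(prompts)

# Taking position -1 returns padding for the shorter sequences.
naive = [row[-1] for row in batch]

# Indexing by each sequence's own length returns its real last token.
correct = [row[i] for row, i in zip(batch, last)]

print(batch)    # [[5, 6, 7], [8, 9, 0], [3, 0, 0]]
print(naive)    # [7, 0, 0]
print(correct)  # [7, 9, 3]
```

In a tensor setting the same idea would presumably become a per-row gather on the logits using the sequence lengths, rather than slicing the last position.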