Julian Quevedo comments


Results: 3 comments of Julian Quevedo

INT8 Support for GPT models

We managed to use all int8 weights for `GptContextAttentionLayer`, `DecoderSelfAttentionLayer`, and `FfnLayer` using `int8WeightPerChannelLdkMultiplicationLauncher`. Since this function only supports `m = 1` and `m = 2`, we used for-loops when...
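The excerpt is cut off before it says when the for-loops are used. As a minimal sketch, assuming they cover inputs with more than two rows, a larger batch could be split into chunks of at most two rows, each passed to the small-`m` routine. Both function names below are hypothetical pure-Python stand-ins, not the actual launcher or its signature, and the per-channel scale layout is likewise an assumption.

```python
def matmul_int8_small_m(x_rows, w_int8, scales):
    """Hypothetical stand-in for a routine that accepts only m = 1 or m = 2 rows.

    x_rows: list of m activation rows, each of length k
    w_int8: k x n list of int8 weight values
    scales: n per-output-channel scales (assumed layout)
    """
    if len(x_rows) not in (1, 2):
        raise ValueError("only m = 1 and m = 2 are supported")
    n = len(scales)
    out = []
    for row in x_rows:
        # Accumulate over k, then apply the per-channel scale.
        out.append([
            scales[j] * sum(row[k] * w_int8[k][j] for k in range(len(row)))
            for j in range(n)
        ])
    return out


def matmul_int8_any_m(x_rows, w_int8, scales):
    """Cover any m by looping over chunks of at most two rows (assumed strategy)."""
    out = []
    for start in range(0, len(x_rows), 2):
        chunk = x_rows[start:start + 2]
        out.extend(matmul_int8_small_m(chunk, w_int8, scales))
    return out


x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # m = 3, so two calls: 2 rows + 1 row
w = [[1, -2], [3, 4]]
s = [0.5, 0.25]
result = matmul_int8_any_m(x, w, s)
```

With three input rows the wrapper issues one call with two rows and one call with a single row, so the restricted routine never sees an unsupported `m`.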

[Roadmap] GraphGym via PyTorch Lightning and Hydra 🚀

> Integrate `LightningDataset`, `LightningNodeData` and `LightningLinkData` modules

New here: what do `LightningNodeData` and `LightningLinkData` refer to?

> Refactor `load_ckpt` and `save_ckpt` with PL checkpoint save and load method

Is this...

repeat_interleave or alternative needed to unpack quantized weights

#974:

> Yeah, on-chip indexing through shared memory isn't supported yet. It's on the roadmap though, but it's a pretty advanced feature so we haven't come up with a specific...