Julian Quevedo

3 comments by Julian Quevedo

We managed to use all int8 weights for `GptContextAttentionLayer`, `DecoderSelfAttentionLayer`, and `FfnLayer` using `int8WeightPerChannelLdkMultiplicationLauncher`. Since this function only supports `m = 1` and `m = 2`, we used for-loops when...
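The comment is cut off before it says when the for-loops are used, so the following is only a sketch under the assumption that they split a larger `m` into chunks of at most two rows. `small_m_int8_matmul` is a hypothetical pure-Python stand-in for the real launcher, not its actual signature, and the per-output-channel scale layout is likewise assumed from the function's name.

```python
def small_m_int8_matmul(x_rows, w_int8, scales):
    """Hypothetical stand-in for a kernel limited to m = 1 or m = 2.

    x_rows : list of m activation rows, each of length k (floats)
    w_int8 : k x n weight matrix stored as small integers (int8 range)
    scales : n per-output-channel dequantization scales (assumed layout)
    """
    if len(x_rows) not in (1, 2):
        raise ValueError("only m = 1 or m = 2 is supported")
    k, n = len(w_int8), len(scales)
    return [
        [scales[j] * sum(row[i] * w_int8[i][j] for i in range(k)) for j in range(n)]
        for row in x_rows
    ]


def chunked_int8_matmul(x_rows, w_int8, scales):
    """Handle any m by looping over chunks of at most two rows (assumed strategy)."""
    out = []
    for start in range(0, len(x_rows), 2):
        # The slice has length 2, or 1 for a trailing odd row.
        out.extend(small_m_int8_matmul(x_rows[start:start + 2], w_int8, scales))
    return out
```

For example, three input rows are processed as one call with `m = 2` followed by one call with `m = 1`.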

> Integrate `LightningDataset`, `LightningNodeData` and `LightningLinkData` modules

New here: what do `LightningNodeData` and `LightningLinkData` refer to?

> Refactor `load_ckpt` and `save_ckpt` with PL checkpoint save and load method

Is this...

#974: > Yeah, on-chip indexing through shared memory isn't supported yet. It's on the roadmap though, but it's a pretty advanced feature so we haven't come up with a specific...