llama2.rs

Shrink the IN dimension with respect to the SIMD width

Open echosprint opened this issue 2 years ago • 9 comments

SIMD_8 is used in the matvec method of QLinear, so the input x with shape (B, IN) should be transformed into [[Simd<f32, 8>; B]; IN/8].

echosprint avatar Sep 01 '23 07:09 echosprint

Changing from [[Simd<f32, 8>; B]; IN] to [[Simd<f32, 8>; B]; IN/8] greatly reduces stack usage (the buffer becomes 8 times smaller) and slightly improves inference speed.
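To illustrate the layout change, here is a minimal sketch, not the repository's actual code. It assumes a plain [f32; 8] alias (Lane8, a hypothetical name) as a stand-in for Simd<f32, 8> so it compiles on stable Rust, and it passes IN/8 as a separate const parameter IN_8, since IN/8 cannot appear in a type on stable. The helper names pack_input and buffer_bytes are made up for this example.

```rust
use std::mem::size_of;

// Assumption: stand-in for `Simd<f32, 8>` so the sketch builds on stable Rust.
// Both hold eight f32 values; the real SIMD type has stricter alignment.
type Lane8 = [f32; 8];

/// Pack a row-major (B, IN) input into [[Lane8; B]; IN_8], where IN_8 = IN / 8.
/// Hypothetical helper; IN_8 is its own const parameter because `IN / 8`
/// is not allowed in a type position without `generic_const_exprs`.
fn pack_input<const B: usize, const IN: usize, const IN_8: usize>(
    x: &[[f32; IN]; B],
) -> [[Lane8; B]; IN_8] {
    assert_eq!(IN, IN_8 * 8, "IN must be exactly 8 * IN_8");
    let mut out = [[[0.0f32; 8]; B]; IN_8];
    for (b, row) in x.iter().enumerate() {
        for (i, &v) in row.iter().enumerate() {
            // Element i of batch row b goes to chunk i / 8, lane i % 8.
            out[i / 8][b][i % 8] = v;
        }
    }
    out
}

/// Byte sizes of the two buffer shapes discussed above:
/// (one lane group per input element, one lane group per 8 input elements).
fn buffer_bytes<const B: usize, const IN: usize, const IN_8: usize>() -> (usize, usize) {
    (
        size_of::<[[Lane8; B]; IN]>(),
        size_of::<[[Lane8; B]; IN_8]>(),
    )
}
```

For example, calling buffer_bytes::<2, 16, 2>() compares the two shapes for B = 2 and IN = 16; the second value is one eighth of the first.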

echosprint avatar Sep 01 '23 07:09 echosprint

This is a good change, does it impact speed?

I was considering enabling a feature-gated Rust feature that lets you do arithmetic on generic constants, but this solution is probably better for now.

srush avatar Sep 01 '23 12:09 srush

I tried #![feature(generic_const_exprs)] with #![allow(incomplete_features)], but it generates a bunch of warnings. On my desktop, the speed changed from achieved tok/s: 4.8163757 to achieved tok/s: 5.0361977.
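One way to avoid generic_const_exprs on stable Rust is to pass IN/8 as a second const parameter and check the relationship at compile time. The sketch below shows that pattern under stated assumptions: Dims and chunk_sums are hypothetical names for illustration, and this is not necessarily how the repository handles it.

```rust
/// Hypothetical marker type that ties IN to IN_8 at compile time.
struct Dims<const IN: usize, const IN_8: usize>;

impl<const IN: usize, const IN_8: usize> Dims<IN, IN_8> {
    // Evaluated for each concrete (IN, IN_8) pair that uses it;
    // a mismatch becomes a compile error instead of a runtime panic.
    const CHECK: () = assert!(IN == IN_8 * 8, "IN must equal 8 * IN_8");
}

/// Sum each group of 8 inputs, returning IN_8 = IN / 8 partial sums.
fn chunk_sums<const IN: usize, const IN_8: usize>(x: &[f32; IN]) -> [f32; IN_8] {
    // Referencing the constant forces the compile-time check.
    let () = Dims::<IN, IN_8>::CHECK;
    let mut out = [0.0f32; IN_8];
    for (o, chunk) in out.iter_mut().zip(x.chunks_exact(8)) {
        *o = chunk.iter().sum();
    }
    out
}
```

A call such as chunk_sums::<16, 2>(&x) compiles, while chunk_sums::<16, 3>(&x) should be rejected at build time. The cost is that callers must spell out both constants.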

echosprint avatar Sep 04 '23 01:09 echosprint

Sorry! I'll get this checked in. I just made a bunch of other changes that I need to merge in.

srush avatar Sep 08 '23 01:09 srush

This project is a great place to learn Rust, Llama, and CUDA (Triton). It is much appreciated, and I hope to contribute something helpful to the project.

echosprint avatar Sep 11 '23 02:09 echosprint

Would love any contribution, I'm also learning Rust and Triton on the fly.

What if we try this library? It seems pretty cool. https://docs.rs/typenum/latest/typenum/

srush avatar Sep 11 '23 12:09 srush

Another idea would be to explore adding testing. Not sure how unit tests work in rust, but it would be nice to have these for small sizes.
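As a rough idea of what such a test could look like, here is a minimal sketch using Rust's built-in test harness (run with cargo test). The functions matvec_naive and matvec_chunked are hypothetical stand-ins written for this example, not the repository's QLinear code; the chunked version mimics an 8-lane SIMD loop with plain arrays so it runs on stable Rust.

```rust
/// Reference implementation: scalar dot product of each weight row with x.
fn matvec_naive<const IN: usize, const OUT: usize>(
    w: &[[f32; IN]; OUT],
    x: &[f32; IN],
) -> [f32; OUT] {
    let mut out = [0.0f32; OUT];
    for (o, row) in out.iter_mut().zip(w.iter()) {
        *o = row.iter().zip(x.iter()).map(|(a, b)| a * b).sum();
    }
    out
}

/// Chunked version: keeps eight partial sums per row, like an 8-lane SIMD loop.
/// Assumes IN is a multiple of 8.
fn matvec_chunked<const IN: usize, const OUT: usize>(
    w: &[[f32; IN]; OUT],
    x: &[f32; IN],
) -> [f32; OUT] {
    assert_eq!(IN % 8, 0, "IN must be a multiple of 8");
    let mut out = [0.0f32; OUT];
    for (o, row) in out.iter_mut().zip(w.iter()) {
        let mut acc = [0.0f32; 8];
        for (wc, xc) in row.chunks_exact(8).zip(x.chunks_exact(8)) {
            for l in 0..8 {
                acc[l] += wc[l] * xc[l];
            }
        }
        *o = acc.iter().sum();
    }
    out
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn chunked_matches_naive_for_small_sizes() {
        // Small, deterministic inputs: IN = 16, OUT = 3.
        let mut w = [[0.0f32; 16]; 3];
        let mut x = [0.0f32; 16];
        for i in 0..16 {
            x[i] = i as f32 * 0.5;
            for o in 0..3 {
                w[o][i] = (o + i) as f32;
            }
        }
        let a = matvec_naive(&w, &x);
        let b = matvec_chunked(&w, &x);
        for o in 0..3 {
            assert!((a[o] - b[o]).abs() < 1e-3);
        }
    }
}
```

Comparing against a slow reference on tiny shapes keeps the test fast and makes layout mistakes (such as an off-by-8 index) show up immediately.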

srush avatar Sep 11 '23 12:09 srush

Importing typenum would unnecessarily complicate the repo and make the code unintuitive, just to work around the limits of generic_const_exprs.

echosprint avatar Sep 12 '23 00:09 echosprint

I am happy to write some unit tests once I have finished carefully reading the source code.

echosprint avatar Sep 12 '23 01:09 echosprint