billwu issues


Results: 2 issues by billwu

Is max_position_embeddings=8096 necessary in the 2b model?

I tried making a few small changes to the '2b' model: 1. Limit max_position_embeddings from 8096 to 256. :) 2. Trim the kv-cache in GemmaAttention to max_position_embeddings (256). 3. Unlimit the output...

type:support
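
One possible reading of steps 2 and 3 above is a rolling cache that keeps only the most recent 256 key/value entries while generation continues past that length. The sketch below is not the GemmaAttention code, and the truncated issue text does not show the actual change; `RollingKVCache` and its methods are hypothetical names, and strings stand in for the key/value tensors.

```python
from collections import deque


class RollingKVCache:
    """Hypothetical fixed-size cache: keeps only the newest `max_len` entries."""

    def __init__(self, max_len):
        # A deque with maxlen silently drops the oldest item when full.
        self.keys = deque(maxlen=max_len)
        self.values = deque(maxlen=max_len)

    def append(self, key, value):
        self.keys.append(key)
        self.values.append(value)

    def __len__(self):
        return len(self.keys)


# Assumed window size, taken from the 256 mentioned in the issue.
cache = RollingKVCache(max_len=256)

# Generate far more positions than the window; memory use stays bounded.
for pos in range(1000):
    cache.append(f"k{pos}", f"v{pos}")

oldest_key = cache.keys[0]
newest_key = cache.keys[-1]
```

With this layout the cache never grows beyond 256 entries, so output length is no longer bounded by cache size. Whether the model still produces sensible text once old positions are dropped is a separate question that this sketch does not address.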

I guess mamba.step could be deleted if selective_scan_fn can accept ssm_state as an input param.

There may be some benefits: 1. The code could be simpler. 2. Inference could be faster. 3. Inference could accept multiple tokens at once this way. There are some...
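
The idea in this issue can be illustrated with a toy scalar recurrence: if the scan function takes the previous state as an input and returns the final state, then a single-token step is just a scan over a sequence of length one, and multi-token chunks work the same way. This is a minimal sketch and not the real selective_scan_fn signature; `scan`, `state`, and the per-token `(a, b, c, x)` tuples are assumptions made for illustration.

```python
def scan(steps, state=0.0):
    """Toy input-dependent linear recurrence.

    Each step is (a, b, c, x):
        state_t = a * state_{t-1} + b * x
        y_t     = c * state_t
    Returns the outputs and the final state so a later call can resume.
    """
    ys = []
    for a, b, c, x in steps:
        state = a * state + b * x
        ys.append(c * state)
    return ys, state


steps = [
    (0.5, 1.0, 2.0, 1.0),
    (0.25, 1.0, 1.0, 2.0),
    (0.5, 2.0, 1.0, 3.0),
    (1.0, 1.0, 0.5, 4.0),
]

# 1) Process the whole sequence in one call.
full_ys, full_state = scan(steps)

# 2) Process it in two chunks, passing the state between calls.
ys_a, carried = scan(steps[:1])
ys_b, chunked_state = scan(steps[1:], state=carried)
chunked_ys = ys_a + ys_b

# 3) Process one token at a time, which is what a separate step function does.
stepwise_ys, stepwise_state = [], 0.0
for token in steps:
    y, stepwise_state = scan([token], state=stepwise_state)
    stepwise_ys.extend(y)
```

All three paths produce the same outputs and the same final state, which is the property that would make a dedicated step function redundant in this toy setting. Whether the same holds for the actual kernel depends on details the truncated issue text does not show.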