MohnishJain comments




MohnishJain

GPTQ quantization for MPT-30 models

@Narsil, are you planning to roll out a GPTQ implementation for MPT-30B? The model has good support for 8K input tokens. The current implementation also has memory fragmentation issues. For flash causal LM...