MohnishJain

Results: 1 comment by MohnishJain

@Narsil are you planning to roll out a GPTQ implementation for MPT-30B? The model has good support for 8K input tokens. The current implementation also has memory fragmentation issues. For flash causal LM...