MohnishJain
@Narsil, are you planning to roll out a GPTQ implementation for MPT-30B? The model has good support for 8K input tokens. The current implementation also has memory fragmentation issues. For flash causal LM...