shraddhaa26 comments

Repositories
Issues
Comments

Results 1 comments of


                                            shraddhaa26

Problem Running Model with GPU

skip module injection for FusedLlamaMLPForQuantizedModel not support integrate without triton yet. The model 'LlamaGPTQForCausalLM' is not supported for text-generation. Supported models are ['BartForCausalLM', 'BertLMHeadModel', 'BertGenerationDecoder', 'BigBirdForCausalLM', 'BigBirdPegasusForCausalLM', 'BioGptForCausalLM', 'BlenderbotForCausalLM', 'BlenderbotSmallForCausalLM',...