Ezhil Raj Selvaraj
3 comments
Use BetterTransformer; it reduces inference time by roughly 20-30%.

from optimum.bettertransformer import BetterTransformer

model = BetterTransformer.transform(model, keep_original_model=False)
Hi, I think I have a temporary solution for this issue. In line 48 of the file ("\metavoice/metavoice-src/fam/llm/fast_inference_utils.py") you have to comment out the line below: torch._inductor.config.fx_graph_cache = ( True...
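The workaround above amounts to disabling TorchInductor's FX graph cache. A minimal sketch of the edit, assuming the truncated line simply enables that cache (the full original line is not shown in the comment):

```python
# fam/llm/fast_inference_utils.py, around line 48 -- sketch of the workaround:
# comment out the assignment that enables TorchInductor's FX graph cache.

# torch._inductor.config.fx_graph_cache = True  # disabled as a temporary fix
```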