Dead-Bytes

Results: 13 comments of Dead-Bytes

I have solved the same issue by making some tweaks to the package imports.

Have you tried changing the import? Some libraries need to be corrected from `from keras import ...` to `from tensorflow.python.utils import ...`.
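The import tweak above can be made robust with a small fallback helper, so the code tries one module path and falls back to the other instead of hard-coding a single version's layout. This is a hedged sketch; `import_first` and the candidate module paths are illustrative, and the right paths depend on your keras/tensorflow versions:

```python
import importlib


def import_first(*names):
    """Return the first importable module from the candidate names.

    Useful when an API moved between package layouts (e.g. standalone
    keras vs. the copy bundled under tensorflow.python).
    """
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(f"none of {names} could be imported")


# Illustrative usage (module paths are assumptions, adjust per version):
# utils = import_first("tensorflow.python.keras.utils", "keras.utils")
```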

Hey @AlexCheema, I have solved this. Currently tested with the SmolLM Llama model.

@AlexCheema we can solve this with a HuggingFaceInference engine. I used transformers and torch for compute, which can easily be run on Windows.

This is also solved by the Hugging Face inference engine.
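The transformers + torch approach mentioned above could look roughly like this. This is a minimal sketch, not exo's actual interface: the class name, method signatures, and the SmolLM model id are assumptions, and the heavy imports are deferred so the wrapper itself stays lightweight:

```python
class HuggingFaceInferenceEngine:
    """Hedged sketch of an inference engine backed by transformers + torch.

    Names and signatures here are illustrative, not exo's real API.
    CPU-only torch runs fine on Windows, which is the point of the comment.
    """

    def __init__(self, model_id: str = "HuggingFaceTB/SmolLM-135M"):
        self.model_id = model_id
        self._model = None
        self._tokenizer = None

    def _load(self):
        # Lazy imports: the class can be constructed without the heavy
        # dependencies installed; they are only needed at inference time.
        import torch
        from transformers import AutoModelForCausalLM, AutoTokenizer

        self._tokenizer = AutoTokenizer.from_pretrained(self.model_id)
        self._model = AutoModelForCausalLM.from_pretrained(
            self.model_id, torch_dtype=torch.float32
        )

    def infer(self, prompt: str, max_new_tokens: int = 32) -> str:
        if self._model is None:
            self._load()
        inputs = self._tokenizer(prompt, return_tensors="pt")
        out = self._model.generate(**inputs, max_new_tokens=max_new_tokens)
        return self._tokenizer.decode(out[0], skip_special_tokens=True)
```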

@jshnjovu No, I didn't test sharding across different Windows machines or run model speed tests. I built the InferenceEngine and worked on solving LlamacppInference, but I didn't hear back from the team, so...

@jshnjovu Also, llamacpp uses the ggml backend, whose compute graph can be broken down into subgraphs and composed on each node. I started that work, but it is currently on hold.
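The subgraph idea above can be illustrated with a toy partitioner: split an ordered list of graph operations into contiguous chunks, one per node. This is only a sketch of the scheduling idea; real ggml graph splitting has to respect tensor dependencies and is far more involved:

```python
def shard_layers(layers, n_nodes):
    """Toy sketch: split an ordered op list into contiguous per-node chunks.

    Earlier nodes get one extra op when the count doesn't divide evenly,
    mirroring the simplest contiguous layer-sharding scheme.
    """
    base, rem = divmod(len(layers), n_nodes)
    shards, start = [], 0
    for i in range(n_nodes):
        size = base + (1 if i < rem else 0)
        shards.append(layers[start:start + size])
        start += size
    return shards
```

For example, 10 ops over 3 nodes yields chunks of sizes 4, 3, and 3, which each node would then execute in sequence, passing activations along.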

I had the same error with `gguf.GGMLQuantizationType.TL1`: TL1 is not getting imported. Does GGMLQuantizationType have it?

I commented out those lines and it's working fine now. I guess gguf.GGMLQuantizationType does not have TL1/TL2 right now.
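Rather than commenting the lines out by hand, a `getattr` guard can skip the missing member on gguf builds that predate it. A sketch of the pattern, using a stand-in enum since the real members come from the installed `gguf` package:

```python
import enum


class GGMLQuantizationType(enum.IntEnum):
    """Stand-in for gguf.GGMLQuantizationType; only two real members shown.

    The actual enum lives in the gguf package and its contents vary by
    version, which is exactly why the getattr guard below is useful.
    """
    F32 = 0
    Q4_0 = 2


# Returns None instead of raising AttributeError when TL1 is absent,
# so TL1-specific handling can be skipped on older gguf versions.
TL1 = getattr(GGMLQuantizationType, "TL1", None)
if TL1 is not None:
    pass  # register TL1-specific handling here
```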