Peji-moghimi

7 comments by Peji-moghimi

@dvmazur @lavawolfiee Could you please address this question? I'd be happy to do this myself if it's not already possible (which I don't think it is), if you could...

Hi @dvmazur! Thank you for your reply. Unfortunately (or fortunately) I have eight 1080 Ti GPUs on my machine, none of which can individually handle the model, even with quantization and...

> > May I ask which quantization setup allowed compression down to 17Gb, or if you could point me to a file that contains that setup please?
>
> It's...

> > the model seems to only occupy ~11Gb on a single GPU without an OOM error, but then at inference there's no utilization of the GPU cores throughout (though...

I also have the same problem, even when just running the `promptify_NER.ipynb` example notebook! For ease of reference, here is the code snippet:

```python
from promptify import Prompter, OpenAI, Pipeline

model...
```

Weirdly, it turns out that if the input is wrapped in triple quotes, it runs just fine, and in a very short span of time.
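For what it's worth, a triple-quoted literal is an ordinary `str` in Python, so the difference presumably comes from how embedded quotes and newlines end up in the rendered prompt, not from a different value being passed. A minimal sketch (the sentence used here is a placeholder, not from the notebook):

```python
# Both literals produce identical str objects; triple quoting only changes
# how embedded quote characters and newlines are written in source code.
single = "Elon Musk founded SpaceX in 2002."
triple = """Elon Musk founded SpaceX in 2002."""

print(single == triple)          # → True
print(type(triple).__name__)     # → str

# Where triple quoting genuinely differs: a string containing a double
# quote needs no escaping, so nothing is accidentally mangled.
quoted = """He said "run it" and it worked."""
print('"run it"' in quoted)      # → True
```

If wrapping the input changes behaviour, it may be worth checking whether the unwrapped version contained stray quotes that broke the generated prompt template.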

I also need to use this with llama-cpp-python.