Dr Sujit Vasanth

Results 35 comments of Dr Sujit Vasanth

I wonder if it is possible to use the bitsandbytes library to quantize MoE-LLaVA-OpenChat via the transformers library, or even directly from your inference demo with deepspeed. I'm going to have a...

@MiskaWasTaken, please test the code out and let me know what you think.

Hi, see this standalone example I made for using LLaVA models (https://github.com/sujitvasanth/TinyLlava-Tk). There is also a moondreamTk.py in the repository that does the same for the moondream model. A video demonstration...

Thank you, there was a massive improvement in speed from using disable_contact_processing : True. I was able to go from 128 robots to 1024 in real time... and as you say now...

I encountered similar errors, so you could try what worked for me, which is along the same lines as Milad-Rakhsha's comment: try to make sure that contact reporter (API)...

I tried the movie capture extension, but it isn't very friendly and produces frames or wrongly timed videos. It's a poor solution, but I record my videos using OBS Studio...

Hi, I have an RTX 3060 desktop and had similar problems until I upgraded to the latest versions of PyTorch with CUDA and upgraded the proprietary Ubuntu driver to 515. Here...

Hi, I did exactly this and it works at a reasonable speed on CPU. Here's my code using a downloaded GGUF model: ``` from llama_cpp import Llama llm = Llama(...

Here is my working code in Python 3.8.10 on Ubuntu 20.04. I'm using TheBloke's quantised version (4 GB) https://huggingface.co/TheBloke/openchat-3.5-0106-GPTQ on an RTX 3090, but I think it needs far fewer resources. This version has...

Thanks @LinB203, you answer all the questions quickly. Thank you! 1. Please consider changing your README to say **tested on** Python 3.10 rather than listing it as a requirement. 2. Yes, I think you are right...