Alex Cheema

418 comments by Alex Cheema

I've assigned you both, @komikat and @AReid987. You'll both receive the bounty for any meaningful work towards this. Feel free to work independently or together, up to you.

> hi @AlexCheema, llama.cpp [seems](https://github.com/ggerganov/llama.cpp/discussions/6404) to natively support sharding using gguf-split, could we just use that to shard the downloaded gguf and run it on connected nodes? I also feel...

> I'm not sure if there is a way to run .gguf files on PyTorch. Running Hugging Face models can be done, but they would have to be dequantised. Since there already is a...
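For context on the dequantisation point: GGUF's Q8_0 format stores weights in blocks of 32, each block holding one fp16 scale followed by 32 int8 values, and dequantising is just `scale * quant` per weight. Here's a minimal illustrative sketch (not exo's or llama.cpp's actual code) of that round trip:

```python
import struct

BLOCK_SIZE = 32  # Q8_0 groups weights into blocks of 32

def dequantize_q8_0(raw: bytes) -> list:
    """Dequantise raw Q8_0 block data back to float weights.

    Each block is a 2-byte fp16 scale followed by 32 int8 quants.
    """
    out = []
    block_bytes = 2 + BLOCK_SIZE
    for off in range(0, len(raw), block_bytes):
        (scale,) = struct.unpack_from("<e", raw, off)                  # fp16 scale
        quants = struct.unpack_from(f"<{BLOCK_SIZE}b", raw, off + 2)   # int8 weights
        out.extend(scale * q for q in quants)
    return out

# Round-trip example: quantise one block of toy weights, then dequantise it.
weights = [(i - 16) / 10.0 for i in range(32)]
scale = max(abs(w) for w in weights) / 127.0
quants = [round(w / scale) for w in weights]
raw = struct.pack("<e", scale) + struct.pack("<32b", *quants)
recovered = dequantize_q8_0(raw)
```

The recovered values match the originals to within the quantisation error of one int8 step, which is why a dequantisation pass is needed before handing the weights to a framework that expects floats.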

> Hi @AlexCheema, I'd love to work on adding support for Llama 3.2 1B in tinygrad. Thanks! Sanchay

Go for it!

> Hello. Is the YAML I sent you via Discord good for you? Thanks in advance. Best regards, Benjamin.

It looks a bit overcomplicated. Basically you...

I'm concerned about increasing the timeout this much. If a request takes this long, it should be treated differently. Request handling generally needs to be reworked with...
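A minimal sketch of the "treat long requests differently" idea, assuming an asyncio-based handler (all names here are hypothetical, not exo's actual API): normal requests stay bounded by a short timeout, while known long-running jobs take a separate path instead of inflating the global timeout.

```python
import asyncio

NORMAL_TIMEOUT = 0.1  # seconds; kept tiny so the example runs quickly

async def handle_request(coro_fn, long_running=False):
    """Run a request; only 'normal' requests are bounded by the short timeout."""
    if long_running:
        # Long jobs bypass the short timeout; in a real server they would be
        # queued and tracked separately rather than blocking the normal path.
        return await coro_fn()
    return await asyncio.wait_for(coro_fn(), timeout=NORMAL_TIMEOUT)

async def main():
    async def quick():
        await asyncio.sleep(0.01)
        return "ok"

    async def slow():
        await asyncio.sleep(1.0)
        return "done"

    results = [await handle_request(quick)]
    try:
        await handle_request(slow)  # too slow for the normal path
    except asyncio.TimeoutError:
        results.append("timed out")
    results.append(await handle_request(slow, long_running=True))
    return results

results = asyncio.run(main())
print(results)
```

The point is that raising one shared timeout to cover the slowest request makes every hung request linger that long; routing slow work explicitly keeps the fast path responsive.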

> Hi, I would like to work on this.

Assigned. Good luck - pls tag me here or on Discord if you have any questions or run into bugs!

Increased bounty to 500 USD as this appears to be harder than anticipated.

No activity for a month. Opening this back up.

> Hey, sorry, I was busy with college. While working on this I found out that there was a race condition within the inference engines. When I tried...
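For readers unfamiliar with the bug class mentioned above, here is a generic sketch of the kind of race two engine threads can hit on shared state, and the lock that fixes it. The shared counter and names are hypothetical, not exo's actual code; it only shows the shape of the bug.

```python
import threading
import time

def worker(state, lock=None, iters=50):
    """Increment a shared counter, with or without synchronisation."""
    for _ in range(iters):
        if lock is None:
            # Unsynchronised read-modify-write: a classic race.
            current = state["tokens"]
            time.sleep(0.0005)  # widen the window between read and write
            state["tokens"] = current + 1
        else:
            with lock:  # the fix: serialise the read-modify-write
                state["tokens"] += 1

def run(lock=None):
    state = {"tokens": 0}
    threads = [threading.Thread(target=worker, args=(state, lock)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["tokens"]

racy, locked = run(), run(threading.Lock())
print(racy, locked)  # racy typically loses updates; locked is exactly 100
```

With two threads each doing 50 increments, the locked version always ends at 100, while the unlocked version usually loses updates because both threads read the same value before either writes it back.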