Alex Cheema
> we are loading up mac studios and notice that on subsequent inference, memory will suddenly spike on one of the nodes as if it is trying to reload model...
Try again on the latest version. There are some fixes in there for Linux, specifically around memory usage.
Merged. Please email [email protected] with your Ethereum address for your $200 bounty in USDC.
Are you using a terminal emulator? Which device? Which version of Android?
The small Llama 3.2 language models are already supported. The larger multimodal vision-language models are not yet implemented: #247
> When using the Llama 3.3 70B model, the text printed in the terminal is normal, but the content returned by the interface shows garbled Chinese characters. Hi, sorry to...
Hold off on this until we have PyTorch support merged (#139). You can already try it, but others have run into issues and I'm not sure of the root cause.
I'm not sure about this. In tinygrad we use the default device, which should be CUDA on an NVIDIA GPU instance. Source: https://docs.tinygrad.org/mnist/
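To illustrate the device-selection behavior described above, here is a minimal sketch of how a backend override via environment variables could work. Note that `default_tinygrad_device` is a hypothetical helper written for illustration; it mirrors the idea (an env var like `CUDA=1` forces a backend, otherwise a default is used) but is not tinygrad's actual implementation, which probes the available hardware.

```python
import os

def default_tinygrad_device() -> str:
    # Hypothetical sketch, not tinygrad's real code: if an environment
    # variable such as CUDA=1 or METAL=1 is set, force that backend;
    # otherwise fall back to a default.
    for backend in ("CUDA", "METAL", "CPU"):
        if os.environ.get(backend) == "1":
            return backend
    return "CPU"  # placeholder fallback; real tinygrad probes hardware

os.environ["CUDA"] = "1"
print(default_tinygrad_device())  # CUDA
```

On an NVIDIA GPU instance with CUDA installed, tinygrad should pick CUDA automatically without any override, which is why no explicit device configuration is needed.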
Closing as I don't think this is necessary. Typically this means you didn't install CUDA.