
Tinychat only displays llama models to download; GPU and/or CPU instructions may not be recognized properly

Open GrahamJenkins opened this issue 10 months ago • 4 comments

Hello, I've been trying to debug things a bit. I have an Ubuntu desktop with an Nvidia card, drivers, CUDA, and pretty much everything I think I need installed. I followed this guide (https://medium.com/@juancrrn/installing-cuda-and-cudnn-in-ubuntu-20-04-for-deep-learning-dad8841714d6) loosely (different versions), and as far as I can tell everything is working on the GPU side.

When I open tinychat, the only model options available are llama-related. I also have exo running on a Mac mini and it shows several others, but these don't show up on my Linux host.

When I run exo, I see the following warnings:

2025-02-14 16:47:47.634644: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2025-02-14 16:47:47.645333: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1739580467.657869   35578 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1739580467.662027   35578 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-02-14 16:47:47.676265: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.                                                                                                      
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Selected inference engine: None

I've searched and attempted to resolve these warnings, but nothing has worked so far.

Regarding the notices:

- oneDNN: seems to be a warning, and can be disabled.
- Unable to register {cuFFT, cuDNN, cuBLAS}: from what I read in TensorFlow issues, these are warnings and won't prevent the detection and use of GPUs.
- "This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations": I read that this just means some load may be offloaded to the CPU, but it still prioritizes the GPU; just info/warning.
- "To enable the following instructions": I'm not sure if this is a problem or just a warning.
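In case it helps anyone else: the oneDNN notice names its own off-switch in the log message, so an easy way to confirm it's harmless is to disable it before launching (a sketch, assuming a bash-style shell):

```shell
# Disable oneDNN custom operations, as the TensorFlow log message itself suggests
export TF_ENABLE_ONEDNN_OPTS=0
```

Then start exo from the same shell; the oneDNN line should disappear while the other warnings remain.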

When I run exo on both devices, the Linux system sees the Mac mini, but the Mac mini thinks it's alone, so I wonder if they are communicating properly.

When exo runs, it displays my computer with an Nvidia card, but registers 0.0 TFLOPS. So I suspect that while it recognizes the card, it's having trouble accessing it. Finally, this same machine works fine with LM Studio and Ollama, so I'm confused about why exo isn't detecting it or running smoothly.

I'd be happy to provide logs or more details if anyone has suggestions on where to continue debugging.

GrahamJenkins avatar Feb 15 '25 00:02 GrahamJenkins

Perhaps this isn't user error; I just saw #697, which appears to describe similar behavior, namely that Mac displays many models while non-Mac displays only llama. Are these related?

GrahamJenkins avatar Feb 19 '25 09:02 GrahamJenkins

you need to run mlx (mac only) to display and use more models. i havent tried running them from the command line, but in the user interface only mac displays many models, via mlx. linux using tinygrad has very few models available in the UI (llama only). maybe when the pytorch support is complete it will display more; there is a branch with pytorch support that works, but the UI still only displays llama models.

BUT after reading again, this seems to be a different issue. i did not have this problem: my linux machines can see the macs and my mac machines can see the linux boxes.

i see this "Selected inference engine: None" — it means a proper inference engine was not selected for your hardware, possibly where the error is coming from?? what command are you running to execute exo??
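im not sure exactly how exo picks its engine internally, but a rough sanity check (not exo's actual detection code, just checking which backend packages are even importable on the machine) would be something like:

```python
import importlib.util

# Rough sanity check: see which inference backends resolve on this machine.
# (Not exo's actual detection logic; it just checks that the packages import.)
for backend in ("mlx", "tinygrad"):
    found = importlib.util.find_spec(backend) is not None
    print(f"{backend}: {'available' if found else 'not installed'}")
```

if tinygrad shows up as not installed on the linux box, that would explain "Selected inference engine: None" there.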

when i did it on my mac and linux setup, after installing the nvidia toolkit and drivers on the linux machine, i simply issued the exo command on both machines and it worked.

i think i also had to add exo to the security settings on the mac, to allow it to modify the hard drive.

woolcoxm avatar Feb 19 '25 14:02 woolcoxm

Is it possible to load the deepseek model on a Mac and then run it in tinychat on Windows????

spacemoonfly avatar Feb 21 '25 17:02 spacemoonfly

I haven't found a way; as far as I know it only works on Mac.

sorry if this is incorrect, i translated using google.

woolcoxm avatar Feb 21 '25 19:02 woolcoxm