Alex Cheema

404 comments by Alex Cheema

> I implemented a temporary workaround using approach 2 in [#656](https://github.com/exo-explore/exo/pull/656). I suppose this isn't a full solution for multi-GPU; it's just a wrapper on VISIBLE_DEVICES. This will be supported...

Hey @kuri54 this sounds like a good use-case of exo. exo allows you to set up an on-premise AI cluster easily without needing a dedicated person or team to set...

Might be a good one for you @varshith15 given you did this already for other vision LLMs!

Added $200 retrospective bounty

This is interesting -- so you're saying we should model the relationship between message size and bandwidth/latency and use that information to change how we split the model?

@abdussamettrkr drop a comment here if you need any help or run into any issues. Here to help.

Shouldn't matter. I've seen people run exo with 23 Mac Minis with no issues: ![image](https://github.com/user-attachments/assets/d6737468-00df-4df0-b499-fbc15fc8cb5d) Can you give more info about the other devices that couldn't find each other? Are...

I believe you can already do this with tinygrad by specifying `VISIBLE_DEVICES`; see https://docs.tinygrad.org/env_vars/
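As a rough illustration of the `VISIBLE_DEVICES` suggestion above, the sketch below builds an environment for a child process that exposes only selected devices. It assumes the variable takes a comma-separated list of device indices; the helper name `env_with_visible_devices` and the device ids are hypothetical and not part of exo or tinygrad.

```python
import os
import subprocess
import sys


def env_with_visible_devices(device_ids, base_env=None):
    """Return a copy of the environment with VISIBLE_DEVICES set.

    Assumption: the variable is a comma-separated list of device indices.
    The base environment is copied, never modified in place.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["VISIBLE_DEVICES"] = ",".join(str(i) for i in device_ids)
    return env


if __name__ == "__main__":
    # Hypothetical device ids; the child here only echoes the variable back.
    child_env = env_with_visible_devices([0, 1])
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.environ['VISIBLE_DEVICES'])"],
        env=child_env,
        capture_output=True,
        text=True,
        check=True,
    )
    print(result.stdout.strip())
```

Setting the variable per child process, rather than globally, would let several processes on one machine each be pinned to a different device.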

I'm still not sure exactly how this helps. It re-raises the exception, so doesn't it just end up with the same behaviour and a slightly different print?

We could automatically cancel the inference when the connection is lost. What do you think?