Alex Cheema
> I implemented a temporary workaround using approach 2 in [#656](https://github.com/exo-explore/exo/pull/656). I suppose this isn't a full solution for multi-gpu, it's just a wrapper on VISIBLE_DEVICES. This will be supported...
Hey @kuri54 this sounds like a good use-case of exo. exo allows you to set up an on-premise AI cluster easily without needing a dedicated person or team to set...
Might be a good one for you @varshith15 given you did this already for other vision LLMs!
Added $200 retrospective bounty
This is interesting -- so you're saying we should model the relationship between message size and bandwidth/latency and use that information to change how we split the model?
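To make the question concrete, here is a minimal sketch of one common way to model it: a linear cost of fixed latency plus message size divided by bandwidth. The function names and the numbers are hypothetical illustrations, not anything exo currently does.

```python
def transfer_time(message_bytes: float, latency_s: float, bandwidth_bytes_per_s: float) -> float:
    """Estimated time to send one message over a link: fixed latency + size / bandwidth."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


def cheapest_link(message_bytes: float, links: dict) -> str:
    """Return the name of the link with the lowest estimated transfer time.

    `links` maps a (hypothetical) link name to a (latency_s, bandwidth_bytes_per_s) tuple.
    """
    return min(links, key=lambda name: transfer_time(message_bytes, *links[name]))


# Illustrative, made-up link characteristics.
links = {
    "low_latency": (0.0001, 1e8),   # 0.1 ms latency, 100 MB/s
    "high_bandwidth": (0.005, 1e9),  # 5 ms latency, 1 GB/s
}
```

With these made-up numbers, small messages favour the low-latency link and large ones favour the high-bandwidth link, which is the kind of signal that could feed into how the model is split.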
@abdussamettrkr drop a comment here if you need any help or run into any issues. I'm here to help.
Shouldn't matter. I've seen people run exo with 23 Mac Minis with no issues. Can you give more info about the other devices that couldn't find each other? Are...
I believe you can already do this with tinygrad by specifying `VISIBLE_DEVICES`; see https://docs.tinygrad.org/env_vars/
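As a rough sketch of what that could look like from Python: build a per-process environment with `VISIBLE_DEVICES` set to a comma-separated list of device indices. The helper name `env_for_devices` is hypothetical, and the launch command in the comment is an assumption about your setup.

```python
import os


def env_for_devices(device_ids, base_env=None):
    """Return a copy of the environment with VISIBLE_DEVICES restricted to device_ids."""
    env = dict(os.environ if base_env is None else base_env)
    # Comma-separated, zero-indexed device identifiers.
    env["VISIBLE_DEVICES"] = ",".join(str(i) for i in device_ids)
    return env


# Possible usage (assumes an `exo` executable is on PATH):
#   import subprocess
#   subprocess.Popen(["exo"], env=env_for_devices([0]))
#   subprocess.Popen(["exo"], env=env_for_devices([1]))
```

Each process then only sees the devices listed in its own environment.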
I'm still not sure exactly how this helps. It re-raises the exception, so doesn't it just end up with the same behaviour and a slightly different print?
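To illustrate the concern with a generic sketch (hypothetical function names, not the code from the PR): a handler that prints and then re-raises still propagates the same exception to the caller.

```python
def risky():
    # Stand-in for whatever call fails in practice.
    raise ConnectionError("peer went away")


def wrapped():
    try:
        risky()
    except Exception as e:
        print(f"error: {e}")
        raise  # Same exception continues up the stack.


def caller_sees_exception(fn) -> bool:
    """Return True if calling fn still raises ConnectionError."""
    try:
        fn()
    except ConnectionError:
        return True
    return False
```

Here `caller_sees_exception` returns `True` for both `risky` and `wrapped`; the only difference is the extra printed line.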
We could automatically cancel the inference when the connection is lost. What do you think?
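A minimal sketch of the idea using plain `asyncio`, assuming the connection layer can signal a drop through an `asyncio.Event`. The name `infer_until_disconnect` and the event are hypothetical, not existing exo APIs.

```python
import asyncio


async def infer_until_disconnect(inference_coro, disconnected: asyncio.Event):
    """Run inference_coro, but cancel it if `disconnected` is set first.

    Returns the inference result, or None if it was cancelled.
    """
    inference = asyncio.ensure_future(inference_coro)
    watcher = asyncio.ensure_future(disconnected.wait())
    done, pending = await asyncio.wait(
        {inference, watcher}, return_when=asyncio.FIRST_COMPLETED
    )
    # Cancel whichever task is still running and wait for it to finish cleanup.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    if inference in done:
        return inference.result()
    return None
```

Returning `None` on cancellation is just one choice for the sketch; raising a dedicated error would work too.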