Alex Cheema
> If ok, I would also like to fix the bug

That would be welcomed. Thank you.
Happy for you to take this one. Your approach looks good, except it doesn't take latency into account. A few tweaks to account for latency should make it work.
Do you mean a custom model, or a dataset?
> Latency is often under 100ms (for me). TP (Tensor Parallelism) shouldn't be affected by high latency. Also latency tests can be done on the device to create optimal routes...
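To make the latency point concrete, here is a minimal sketch of what "taking latency into account" when choosing routes could look like: rank candidate nodes by measured link latency plus estimated compute time, rather than by throughput alone. All names and numbers below are assumptions for illustration, not exo's actual partitioning code.

```python
# Hypothetical sketch: order nodes by effective time per hop, where the
# per-device latency measurements mentioned above feed into route selection.
def score_node(throughput_tokens_per_s, latency_s, tokens_per_hop=1.0):
    # Effective time per hop = link latency + compute time for the tokens.
    return latency_s + tokens_per_hop / throughput_tokens_per_s

def order_nodes(nodes):
    """nodes: dict of name -> (throughput_tokens_per_s, latency_s).
    Returns node names sorted best-first (lowest effective time)."""
    return sorted(nodes, key=lambda name: score_node(*nodes[name]))

# A slower node with a much faster link can still win the ordering:
print(order_nodes({"fast_far": (100.0, 0.05), "slow_near": (50.0, 0.01)}))
```

Under these made-up numbers, `slow_near` scores 0.01 + 0.02 = 0.03 s per hop versus 0.06 s for `fast_far`, so latency-aware ordering prefers it.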
First of all, thanks a lot for taking the time to run exo when it's still experimental. Most of all, thank you so much for making an issue - these...
I pushed a quality-of-life improvement so you can use the ChatGPT-compatible API endpoint from any node: https://github.com/exo-explore/exo/commit/8a35fd83f6e07b51b62e0dbe49028c9ef5f0455b
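For anyone trying this out, a minimal sketch of hitting that endpoint follows. The host, port, model name, and URL path here are assumptions (standard OpenAI-style chat-completions shape), not values taken from the commit; substitute your own node's address and a model your cluster actually serves.

```python
import json

# Hypothetical node address -- any node in the cluster should work.
NODE_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(prompt, model="llama-3-8b"):
    """Build an OpenAI-style chat-completion payload (model name is a placeholder)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_chat_request("Hello from any node")
print(json.dumps(payload))

# To actually send it (requires a running exo node at NODE_URL):
#   import urllib.request
#   req = urllib.request.Request(
#       NODE_URL,
#       data=json.dumps(payload).encode(),
#       headers={"Content-Type": "application/json"},
#   )
#   print(urllib.request.urlopen(req).read().decode())
```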
Closed by mistake
Can you try this again @matt-pulsipher? I can't reproduce anymore, and I fixed a few things recently.
> > First of all, thanks a lot for taking the time to run exo when it's still experimental. Most of all, thank you so much for making an issue...
Awesome, let me know if you need any help / want to run anything by me! Also, I love this idea - made an issue for it here: https://github.com/exo-explore/exo/issues/52