Add Tailscale Support
I would like to add Tailscale support by adding a search for peers on 100.x.x.x addresses.
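For reference, Tailscale hands out node addresses from the CGNAT block 100.64.0.0/10, which is narrower than all of 100.x.x.x. A minimal sketch of the address check, using only the standard library (the function name is hypothetical, not something in EXO):

```python
import ipaddress

# Tailscale node addresses come from the CGNAT block 100.64.0.0/10,
# i.e. 100.64.0.0 through 100.127.255.255, not the whole 100.0.0.0/8.
TAILSCALE_CGNAT = ipaddress.ip_network("100.64.0.0/10")


def is_tailscale_address(addr: str) -> bool:
    """Return True if addr is an IPv4 address inside 100.64.0.0/10."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        # Not a valid IP literal at all.
        return False
    return ip.version == 4 and ip in TAILSCALE_CGNAT
```

Note that other CGNAT deployments share this block, so a match only suggests a tailnet address; it does not prove it.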
This is a good idea, but we need to be careful about the implementation. Tailnets should be strongly de-prioritised relative to anything local - latency tends to be really important for the links we're using, and Tailscale, especially in userspace on macOS, is sure to hurt it noticeably.
The link profiles may take care of this, but it needs lots of testing when someone comes to implement it!
Latency is often under 100 ms (for me). TP (Tensor Parallelism) shouldn't be affected by high latency. Also, latency tests can be done on the device to create optimal routes for processing.
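A rough sketch of what an on-device latency test could look like: time a few TCP connects to each peer and sort by the best result. This is an illustration under assumptions, not how EXO measures links; the function names are hypothetical, and TCP connect time is only a proxy for round-trip latency.

```python
import socket
import time
from typing import List, Optional, Tuple

Peer = Tuple[str, int]  # (host, port)


def tcp_connect_rtt(host: str, port: int, timeout: float = 1.0) -> Optional[float]:
    """Return the seconds taken to open a TCP connection, or None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.perf_counter() - start
    except OSError:
        return None


def rank_peers(peers: List[Peer], samples: int = 3) -> List[Peer]:
    """Return reachable peers ordered by their lowest measured connect time."""
    measured = []
    for host, port in peers:
        rtts = [tcp_connect_rtt(host, port) for _ in range(samples)]
        rtts = [r for r in rtts if r is not None]
        if rtts:
            # Keep the minimum: it is the sample least affected by jitter.
            measured.append((min(rtts), (host, port)))
    return [peer for _, peer in sorted(measured)]
```

Ranking by measured time would naturally push a slow tailnet path behind a local one without special-casing the address range.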
Tensor Parallelism is highly sensitive to latency. It requires low double-digit microsecond latency to give a speedup.
Pipeline Parallelism, on the other hand, is perfectly fine over higher-latency links, i.e. tens of milliseconds (3-4 OOM more than is acceptable with Tensor Parallelism).
There’s also a case for simple data parallelism over something like Tailscale, i.e. horizontally scaling small models.
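A back-of-the-envelope version of that gap, as a sketch. It assumes a Megatron-style split with two all-reduces per transformer layer, counts each all-reduce as a single link latency (optimistic), and counts one activation hand-off per pipeline stage boundary. The layer count, stage count, and latencies below are illustrative numbers, not measurements.

```python
def tp_sync_latency_per_token(num_layers: int, link_latency_s: float,
                              allreduces_per_layer: int = 2) -> float:
    """Latency spent waiting on all-reduces for one token under tensor parallelism."""
    return num_layers * allreduces_per_layer * link_latency_s


def pp_sync_latency_per_token(num_stages: int, link_latency_s: float) -> float:
    """Latency spent on stage-to-stage hand-offs for one token under pipeline parallelism."""
    return (num_stages - 1) * link_latency_s


# Illustrative: a 32-layer model, either tensor-parallel or split into 2 pipeline stages.
tp_local = tp_sync_latency_per_token(32, 20e-6)    # 20 microsecond link
tp_tailnet = tp_sync_latency_per_token(32, 20e-3)  # 20 millisecond link
pp_tailnet = pp_sync_latency_per_token(2, 20e-3)   # 20 millisecond link
```

With these numbers the tensor-parallel wait is about 1.28 ms per token on the fast link and about 1.28 s per token on the slow one, while the two-stage pipeline pays about 20 ms per token on the slow link.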
Why is TP affected by high latency? As far as I know, TP computes the layers horizontally, and the result of one layer is only needed for the next layer (matrix mul). With Tailscale configured correctly, exit nodes and gateways can be used to get latency down into the single digits.
You can do this in userspace right now, actually. You need to add your Tailscale interface to your mDNS multicast group, and the rest should work automagically. In a sense, it's actually a Tailscale setting, not something in EXO. While we could autoconfigure it, we really shouldn't mess with user settings in that way.
(In theory. I have not tested this.)
I will test this tomorrow. My system is currently down.
Some reading online suggests it may not be possible to do this without forking Tailscale - I guess we need manual configuration overrides, and we can scan & dial Tailscale addresses afterwards.
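One possible shape for a manual override, purely as a sketch: a comma-separated list of `host[:port]` entries parsed into peers to dial. The function name, the input format, and the default port of 5000 are all hypothetical and not taken from EXO; IPv6 literals are not handled.

```python
from typing import List, Tuple


def parse_manual_peers(spec: str, default_port: int = 5000) -> List[Tuple[str, int]]:
    """Parse 'host[:port],host[:port],...' into a list of (host, port) pairs.

    default_port is a placeholder value for this sketch, not a real EXO port.
    """
    peers = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas and blank entries
        host, sep, port_text = entry.rpartition(":")
        if sep:
            peers.append((host, int(port_text)))
        else:
            # No ':' present, so the whole entry is the host.
            peers.append((entry, default_port))
    return peers


# Hypothetical override listing two tailnet peers, one with an explicit port.
peers = parse_manual_peers("100.101.102.103:6000, 100.64.0.9,")
```

The resulting list could then be dialled directly, skipping discovery for those peers.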