
[TRACKING] Windows Support

Open xxwtiancai opened this issue 11 months ago • 16 comments

Hi, thank you for this great project!

I'm interested in running this project on Windows, but I noticed there's currently no explicit documentation or support for Windows environments.

I'd like to ask whether there are any solutions for running exo on Windows or WSL. I've tried every code version from July until now and reviewed over 300 issues for possible solutions. Unfortunately, I still haven't managed to get it running successfully on my two Windows machines; I can't even discover any nodes other than my own. I would be very grateful if you could provide some guidance for running on Windows.

xxwtiancai avatar Jan 16 '25 11:01 xxwtiancai

exo is still not supported for native Windows. I got my WSL networking issue working with this: #455

tensorsofthewall avatar Jan 17 '25 07:01 tensorsofthewall

> exo is still not supported for native Windows. I got my WSL networking issue working with this: #455

Thanks for your advice, but it doesn't seem to work on my Windows PC. I tried that issue's fix and added a .wslconfig; exo runs correctly but only detects one node, and my two PCs still can't find each other even though they're on the same network. Did you take any other steps to get exo working on your Windows PC, such as opening or closing Windows Firewall ports?

xxwtiancai avatar Jan 17 '25 07:01 xxwtiancai

On Windows, my firewall is currently off. But I didn't have to do anything to get my WSL exo node detected and synced with my Mac node.

If this is the issue, you can add firewall rules to allow incoming/outgoing connections on the ports exo uses. By default exo discovers peers over UDP, so one option is to allow incoming connections on the whole dynamic range, ports 49152 to 65535. Alternatively, you can check the specific port numbers in exo/main.py, mainly in the argument defaults, if you don't want to expose all those ports (I highly recommend doing this instead of opening the whole range).
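To confirm UDP actually gets through once a firewall rule is in place, a quick sketch like the one below can help. It runs entirely on loopback here; to test across hosts, run the sender half on the peer machine, pointed at this machine's LAN IP. The port is just an example from this thread, not something exo guarantees:

```python
import socket

# Example port from this thread; substitute whatever port your exo node
# actually reports in its logs.
PROBE_PORT = 52415

# Listener standing in for the exo node's discovery socket.
listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
listener.bind(("0.0.0.0", PROBE_PORT))
listener.settimeout(2.0)

# Sender standing in for a peer on the network.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"exo-probe", ("127.0.0.1", PROBE_PORT))

data, addr = listener.recvfrom(1024)
print("received", data, "from", addr)

listener.close()
sender.close()
```

If the cross-host version of this times out while the loopback version works, the firewall (or WSL's NAT, see below in the thread) is dropping the datagrams.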

tensorsofthewall avatar Jan 17 '25 08:01 tensorsofthewall

Thanks a lot. I'm sure I already tried allowing incoming/outgoing connections on those ports, but my two Windows PCs still don't see each other. Did your WSL node detect your Mac? As far as I know, many developers have the same problem: a Windows PC can't detect any nodes, while PCs on other operating systems can detect the Windows one. But both of my PCs are Windows :( .

xxwtiancai avatar Jan 17 '25 08:01 xxwtiancai

Yes, my WSL node was able to detect the Mac with no issues. I tested with disconnections and everything as well. I had issues with inference, but I'm still sorting that out as we speak. Can you enable DEBUG=9 and post the logs?

tensorsofthewall avatar Jan 17 '25 08:01 tensorsofthewall

Yes, I will test again and post the logs later. Thanks.

xxwtiancai avatar Jan 17 '25 08:01 xxwtiancai

Hey tensorsofthewall, I just made some changes on my two Windows PCs (I killed the IP Helper service, which was holding 0.0.0.0:80), and now one PC can see the other. One question remains: which models work on Windows? I tried Llama 3.2 1B and got the error: size mismatch, can't reshape self.shape=(1,28,2048,2048) to new shape (1,38,32,64). So it still doesn't work.
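For what it's worth, the two shapes in that error can never be reshaped into one another, because a reshape must preserve the total element count and these two differ. That points at a model-configuration mismatch (e.g. wrong head/dimension settings for the checkpoint) rather than a memory problem. A quick check in plain Python, with the shapes copied from the error:

```python
from math import prod

# Shapes taken verbatim from the error message.
old_shape = (1, 28, 2048, 2048)
new_shape = (1, 38, 32, 64)

# A reshape is only valid when both shapes contain the same number
# of elements; here they don't, so no reshape can ever succeed.
print(prod(old_shape))  # 117440512
print(prod(new_shape))  # 77824
print(prod(old_shape) == prod(new_shape))  # False
```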

xxwtiancai avatar Jan 21 '25 03:01 xxwtiancai

This is probably because tinygrad doesn't support Llama 3.2+ models yet, but I'm not sure. In any case, tinygrad is no longer supported on Windows, so it could be a different issue. I suggest you wait for the PyTorch PR to be merged, which should provide support across platforms. See #139

tensorsofthewall avatar Jan 21 '25 03:01 tensorsofthewall

Thanks a lot, I'll follow that. By the way, do you know of any framework that supports distributed inference across multiple Intel CPUs? I want to do some performance evaluation of different networking setups (Wi-Fi, Thunderbolt 3, etc.).

xxwtiancai avatar Jan 21 '25 03:01 xxwtiancai

Was anyone able to get this working? I have exo running with no issues on my Windows WSL Linux instances. I've added the .wslconfig and I'm on Windows 11. I've port-forwarded all possible UDP ports. I can access the API and the web browser interface from any machine, but it won't recognize the other machines; it only ever finds one node. Any ideas?

DEBUG=9 logs:

```
king@ogking:/mnt/c/Users/t_jon$ DEBUG=9 exo
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
Selected inference engine: None

[exo ASCII banner]

Detected system: Linux
Inference engine name after selection: tinygrad
get_inference_engine called with: tinygrad
Using inference engine: TinygradDynamicShardInferenceEngine with shard downloader: HFShardDownloader
Trying to find available port port=53705 []
Using available port: 53705
Generated and stored new node ID: 3647896f-c58c-42df-a191-0b682fbbff65
Chat interface started:
  • http://127.0.0.1:52415
  • http://172.19.137.7:52415
ChatGPT API endpoint served at:
  • http://127.0.0.1:52415/v1/chat/completions
  • http://172.19.137.7:52415/v1/chat/completions
Model storage directory: /home/king/.cache/huggingface has_read=True, has_write=True
tinygrad Device.DEFAULT='CLANG'
Server started, listening on 0.0.0.0:53705
tinygrad Device.DEFAULT='CLANG'
update_peers: added=[] removed=[] updated=[] unchanged=[] to_disconnect=[] to_connect=[]
Collecting topology max_depth=4 visited=set()
Collected topology: Topology(Nodes: {3647896f-c58c-42df-a191-0b682fbbff65: Model: Linux Box (Device: CLANG). Chip: Unknown Chip (Device: CLANG). Memory: 15962MB. Flops: fp32: 0.00 TFLOPS, fp16: 0.00 TFLOPS, int8: 0.00 TFLOPS}, Edges: {})
update_peers: added=[] removed=[] updated=[] unchanged=[] to_disconnect=[] to_connect=[] did_peers_change=False
Collecting topology max_depth=4 visited=set()
```

kingcozz avatar Jan 27 '25 20:01 kingcozz
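One detail worth noting in the log above: the node advertises 172.19.137.7, which sits in WSL2's NAT subnet rather than on the LAN, so other machines on the network can't reach it. A quick, hedged way to see which source address a machine would advertise (the 8.8.8.8 target is arbitrary; `connect()` on a UDP socket only selects a route and sends no packets):

```python
import socket

# Discover the source IP this machine would use for outbound traffic.
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.connect(("8.8.8.8", 80))   # route selection only; nothing is transmitted
local_ip = s.getsockname()[0]
s.close()
print(local_ip)
```

Under default (NAT) WSL2 networking this prints a 172.16.0.0/12-style address that the rest of the LAN can't route to, consistent with the 172.19.137.7 seen in the log.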

Please I need a littile bit of help. I am trying to run exo on WSL as well. when i run just the exo command it shows me available commands. but when I try to run the command "exo run llama-3.2-3b" I get "error: unknown command "run" for "exo"

Did you mean this? runstatus" error. Please help.

Fahad16301139 avatar Feb 05 '25 02:02 Fahad16301139

+1

Youho99 avatar Feb 28 '25 00:02 Youho99

+1

maikzz32 avatar Mar 04 '25 20:03 maikzz32

What..?

Here, I thought there'd be CUDA issues but you guys can't even get it networking?

Looking at the above, and generally for WSL instances: they run in their own network on the host. While WSL is all accessible to the host locally by default, it won't see the rest of the network.

You need to configure your WSL instance to be 'bridged networking' not 'host only' networking.
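For reference, on recent Windows 11 builds WSL2 can also mirror the host's network interfaces directly, which sidesteps the NAT subnet entirely. A sketch of `%UserProfile%\.wslconfig` (requires a recent WSL version; run `wsl --shutdown` afterwards for it to take effect):

```ini
[wsl2]
# Mirror the host's network interfaces into WSL instead of using NAT
networkingMode=mirrored
```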

But, assuming you do that, you'll likely still want GPU acceleration, which admittedly is something I haven't checked into on WSL in a while, but it wasn't really working well last time I did.

Unfortunately Linux nv drivers don't like my dual dGPU+single iGPU+9-12 screen setup (last time I tried, huge issues with performance on screens that weren't the main GPU, like, can't even move a window around bad)

So.. I'll give it a shot; I have a Studio M1 Ultra, a quad-socket server with 512GB RAM, and my Windows PC.

If I get it working with CUDA as well, I'll post back. Other than pointing out the above, though, I can't help you with your network issues, nor are they related to exo (not that I am in any way either; I'm just pointing out that it's not exo's fault you can't access your local network from WSL).

Edit:

I totally jinx'd myself.. CUDA ISSUES!!!!!# YAY

I've spent the last few hours since my initial reply messing around trying to get it up on WSL.

Seemingly got it running with CLANG but, yeah, that's not ideal..

Trying with CUDA, but even the 1B model fails with a CUDA memory error. At first I thought it was trying my 8GB 2070S first (because exo/topology/device_capabilities.py only enumerates the first device, hard-coded), but nvidia-smi actually shows a python process on both devices. Only the 12GB 3060 shows any change in memory usage, going up to about 2.5GB (split between shared and dedicated). It's physically the second card and shows as device 1 in nvidia-smi, but the CUDA code seems to enumerate them in the opposite order.
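The ordering mismatch described above is a known quirk: by default the CUDA runtime enumerates GPUs "fastest first", while nvidia-smi lists them by PCI bus position, so nvidia-smi's device 1 can be CUDA's device 0. A small sketch of the documented environment variables that reconcile the two (the index "1" is illustrative, and these must be set before CUDA is initialized in the process):

```python
import os

# Make the CUDA runtime number GPUs the same way nvidia-smi does,
# i.e. by PCI bus position instead of "fastest first".
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"

# With consistent ordering you can also pin the process to one card,
# e.g. the second physical GPU, so code that hard-codes "device 0"
# picks the card you intend.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

print(os.environ["CUDA_DEVICE_ORDER"], os.environ["CUDA_VISIBLE_DEVICES"])
```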

ForbiddenEra avatar Mar 05 '25 06:03 ForbiddenEra

I have encountered issues between my MacBook M1 and Windows 10 WSL Ubuntu 22; the two machines are on the same local network. I forwarded ports 5678 and 52415 on Windows 10 using `netsh interface portproxy add v4tov4` and also opened firewall inbound rules for 5678 (UDP) and 52415 (TCP). The result is that the two machines still cannot discover each other. What could be the reason? Thanks!
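One likely culprit here: `netsh interface portproxy` forwards TCP only, so the TCP port reaches WSL but the UDP discovery traffic on 5678 never does. A one-way UDP relay sketch that could stand in for the missing forwarding (ports and the WSL target address are illustrative; in practice a bidirectional tool such as socat is simpler, since discovery also needs the reverse path):

```python
import socket
import threading
import time

def udp_relay(listen_port: int, target_host: str, target_port: int,
              stop: threading.Event) -> None:
    """Forward every datagram arriving on listen_port to the target."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", listen_port))
    sock.settimeout(0.5)  # wake periodically to check the stop flag
    out = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    while not stop.is_set():
        try:
            data, _ = sock.recvfrom(65535)
        except socket.timeout:
            continue
        out.sendto(data, (target_host, target_port))
    sock.close()
    out.close()

# Loopback demo: relay 127.0.0.1:15678 -> 127.0.0.1:15679.
# On a real setup the target would be the WSL instance's address.
stop = threading.Event()
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(("127.0.0.1", 15679))
rx.settimeout(2.0)
t = threading.Thread(target=udp_relay, args=(15678, "127.0.0.1", 15679, stop))
t.start()
time.sleep(0.3)  # give the relay a moment to bind before sending

tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
tx.sendto(b"exo-discovery-probe", ("127.0.0.1", 15678))
data, _ = rx.recvfrom(1024)
print(data)

stop.set()
t.join()
rx.close()
tx.close()
```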

LGDHuaOPER avatar Mar 16 '25 15:03 LGDHuaOPER