Alex Cheema
Watching this. I think some users might have run into this issue using exo: https://github.com/exo-explore/exo/issues/152
Thanks for this! This should be a one line change. Please fix conflicts and keep formatting the same as it was.
lgtm after conflicts resolved.
Not a fan of adding a dependency on watchdog for this. We don't need instant config application. Periodically checking every 5 secs for config changes is fine.
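To illustrate the polling approach suggested above, here is a minimal sketch of checking a config file's mtime every 5 seconds instead of using watchdog. The file name, function names, and the `apply_config` callback are hypothetical, not exo's actual code:

```python
import os
import time

CONFIG_PATH = "config.json"  # hypothetical config file location
POLL_INTERVAL = 5            # seconds, per the comment above

def check_config(path, last_mtime):
    """Return (changed, new_mtime) by comparing the file's current mtime
    against the last one we saw. A missing file reports mtime None."""
    try:
        mtime = os.path.getmtime(path)
    except FileNotFoundError:
        mtime = None
    return (mtime != last_mtime, mtime)

def watch_config(apply_config, path=CONFIG_PATH, interval=POLL_INTERVAL):
    """Re-apply the config whenever the file changes, polling every `interval` seconds."""
    last_mtime = None
    while True:
        changed, last_mtime = check_config(path, last_mtime)
        if changed and last_mtime is not None:
            apply_config(path)
        time.sleep(interval)
```

This trades up to 5 seconds of latency for zero extra dependencies, which is the point of the comment: config changes don't need instant application.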
Closing in favour of https://github.com/exo-explore/exo/pull/383. Please contact me for a $100 bounty with your ethereum address, as I'd like to reward good experimentation: [[email protected]](mailto:[email protected])
good catch. assigned. go for it!
I don't see anything obviously wrong with your setup. It all looks correct. The logs suggest perhaps some networking issues. The fact that it generates some tokens then stops also...
Just updated the README since exo only requires that you have enough memory across all your devices and is designed to run on anything that has a Python runtime.
Was there anything else you had in mind here, @shad-ct?
Hey, thanks for reporting this. The likely reason: ollama is using a 4-bit quantized model, whereas exo is using the fp16 unquantized model. I have created an issue...
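The quantization difference above translates directly into memory. A rough back-of-the-envelope sketch (the 8B parameter count is an illustrative assumption, not tied to the specific model in the report):

```python
def weights_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate size of the model weights alone, in gigabytes (10^9 bytes).
    Ignores activations, KV cache, and runtime overhead."""
    return num_params * bits_per_param / 8 / 1e9

params = 8e9  # e.g. an 8B-parameter model (illustrative)

fp16 = weights_gb(params, 16)  # unquantized, as exo loads it
q4 = weights_gb(params, 4)     # 4-bit quantized, as ollama uses

print(f"fp16: {fp16:.1f} GB, 4-bit: {q4:.1f} GB")  # fp16 is 4x larger
```

So the same model needs roughly 4x the memory in exo's fp16 form, which explains why a setup that runs it fine under ollama can fail here.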