blindcrone
One thing that jumps out at me is that the "side" views currently being used in the cube all use the same prompt. Even though the distinction between e.g. "view from...
I can probably contribute at least an AUR package to this lol
Might be worth making it a bit more client-servery though. I'd like this to be able to run sanely under systemd at least
1. Aye, like "exo run" this is mostly meant for testing and demo purposes, you'll need to feed it a specific model as it doesn't yet hook up to the...
Since training on exo clusters is a whole new use case, I think the features of it will need to be built out over time, and in conversation with the...
Okay so at this point it seems like removing the abstract base class as part of the requirements for merging this is starting to produce conflicts. I think I've resolved...
Fixed the conflicts AFAICT, hopefully this can be merged now
I like the idea here, but I think that rather than relying on a folder structure, this should use config files or command line arguments to specify paths to model implementations and...
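To make the command-line option concrete, here's a rough sketch of what I mean. The `--model-impl` and `--weights` flags and the `load_module_from_path` helper are made up for illustration and are not anything exo currently exposes; the point is just that the path comes from the caller rather than from a directory convention.

```python
import argparse
import importlib.util
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with hypothetical flags for locating a model implementation."""
    parser = argparse.ArgumentParser(description="Explicit model paths (illustrative only)")
    # Hypothetical flag: path to a Python file that defines the model.
    parser.add_argument("--model-impl", type=Path, required=True,
                        help="path to a .py file containing the model implementation")
    # Hypothetical flag: optional path to the weights for that model.
    parser.add_argument("--weights", type=Path, default=None,
                        help="optional path to model weights")
    return parser


def load_module_from_path(path: Path):
    """Import a Python module from an explicit file path instead of a fixed folder layout."""
    spec = importlib.util.spec_from_file_location(path.stem, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load a module from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```

Usage would be something like `args = build_parser().parse_args()` followed by `load_module_from_path(args.model_impl)`. A config file could feed the same two values, so the loader itself doesn't need to know where the paths came from.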
Okay, this now in theory trains across nodes on MLX. I'll need to add the ability to save the weights somewhere to see how well it actually does, and it'd...
I think at some point it would make sense to allow more granular sharding of models than just transformer blocks anyway, and this could involve updating to a memory-footprint heuristic...
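As a rough illustration of what a memory-footprint heuristic could look like, here's a minimal sketch that splits a list of shardable units (which could be finer than whole transformer blocks) into contiguous ranges, giving each node a share of the total bytes proportional to its memory. The function name, the inputs, and the proportional rule are all assumptions for the sake of the example, not exo's actual partitioning code, and it does not check that a node's share actually fits in its memory.

```python
from typing import Dict, List, Tuple


def partition_by_memory(unit_bytes: List[int],
                        node_memory: Dict[str, int]) -> Dict[str, Tuple[int, int]]:
    """Assign contiguous half-open ranges [start, end) of units to nodes.

    Each node receives units until the running byte total would exceed its
    cumulative, memory-proportional share of the model. The last node takes
    whatever remains.
    """
    if not unit_bytes or not node_memory:
        raise ValueError("need at least one unit and one node")

    total_bytes = sum(unit_bytes)
    total_memory = sum(node_memory.values())
    nodes = list(node_memory.items())

    assignment: Dict[str, Tuple[int, int]] = {}
    start = 0
    consumed = 0          # bytes assigned so far across all nodes
    cumulative_memory = 0  # memory of the nodes handled so far

    for i, (name, memory) in enumerate(nodes):
        cumulative_memory += memory
        end = start
        if i == len(nodes) - 1:
            end = len(unit_bytes)  # last node takes the remainder
        else:
            # Integer form of: consumed + next <= total_bytes * cumulative_memory / total_memory
            while (end < len(unit_bytes)
                   and (consumed + unit_bytes[end]) * total_memory
                   <= cumulative_memory * total_bytes):
                consumed += unit_bytes[end]
                end += 1
        assignment[name] = (start, end)
        start = end

    return assignment
```

The units stay contiguous so that activations only cross a node boundary once per neighbouring pair, which is the same constraint block-level sharding has. Only the granularity and the weighting change.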