[BUG] "Too many open files" error when launching many instances in succession; instances cannot launch
Describe the bug
When launching many instances in succession, we run into "Too many open files" errors (OSError: [Errno 24]).
To Reproduce
Steps to reproduce the behavior:
- Launch 10 different instances in a row.
- The error shown under "Actual behavior" is then logged every 15 seconds.
- Launching an instance leaves it stuck in the PREPARING state.
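To check whether descriptors accumulate with each launch, the open-descriptor count can be sampled between the launches in the steps above. This is a minimal sketch, assuming a /dev/fd directory is available (as on macOS and Linux). `count_open_fds` is a hypothetical helper, not part of exo, and it only sees the process that calls it, so it would need to run inside the exo process (for example from a temporary debug log line).

```python
import os


def count_open_fds() -> int:
    """Return the number of file descriptors open in the calling process.

    Assumes /dev/fd exists (macOS, Linux). Listing the directory may briefly
    open one extra descriptor, so compare readings against each other rather
    than treating a single reading as exact.
    """
    return len(os.listdir("/dev/fd"))


if __name__ == "__main__":
    before = count_open_fds()
    r, w = os.pipe()  # stand-in for the sockets/files an instance launch opens
    after = count_open_fds()
    os.close(r)
    os.close(w)
    print(f"descriptors before={before} after={after} delta={after - before}")
```

A count that keeps rising after each launch and never falls back would point to a descriptor leak rather than a limit that is merely too low.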
Expected behavior
I expect to be able to keep launching as many instances as the cluster can support.
Actual behavior
Instances get stuck in the PREPARING state, and this error is logged every 15 seconds:
[ 12:05:20.6448PM | ERROR ] socket.accept() out of system resource
socket: <asyncio.TransportSocket fd=55, family=2, type=1, proto=0, laddr=('0.0.0.0', 52415)>
Traceback (most recent call last):
File "/Users/s13/exo/.venv/bin/exo", line 10, in <module>
sys.exit(main())
│ │ └ <function main at 0x100f3de40>
│ └ <built-in function exit>
└ <module 'sys' (built-in)>
File "/Users/s13/exo/src/exo/main.py", line 206, in main
anyio.run(node.run)
│ │ │ └ <function Node.run at 0x10937a980>
│ │ └ Node(router=<exo.routing.router.Router object at 0x109434830>, worker=<exo.worker.main.Worker object at 0x1094acf50>, electio...
│ └ <function run at 0x101234b80>
└ <module 'anyio' from '/Users/s13/exo/.venv/lib/python3.13/site-packages/anyio/__init__.py'>
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/anyio/_core/_eventloop.py", line 74, in run
return async_backend.run(func, args, {}, backend_options)
│ │ │ │ └ {}
│ │ │ └ ()
│ │ └ <bound method Node.run of Node(router=<exo.routing.router.Router object at 0x109434830>, worker=<exo.worker.main.Worker objec...
│ └ <classmethod(<function AsyncIOBackend.run at 0x109461440>)>
└ <class 'anyio._backends._asyncio.AsyncIOBackend'>
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/anyio/_backends/_asyncio.py", line 2325, in run
return runner.run(wrapper())
│ │ └ <function Node.run at 0x1094631a0>
│ └ <function Runner.run at 0x1015b8540>
└ <asyncio.runners.Runner object at 0x1094ac050>
File "/Users/s13/.local/share/uv/python/cpython-3.13.7-macos-aarch64-none/lib/python3.13/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
│ │ │ └ <Task pending name='exo.main.Node.run' coro=<Node.run() running at /Users/s13/exo/.venv/lib/python3.13/site-packages/anyio/_b...
│ │ └ <function BaseEventLoop.run_until_complete at 0x1015b5f80>
│ └ <_UnixSelectorEventLoop running=True closed=False debug=False>
└ <asyncio.runners.Runner object at 0x1094ac050>
File "/Users/s13/.local/share/uv/python/cpython-3.13.7-macos-aarch64-none/lib/python3.13/asyncio/base_events.py", line 712, in run_until_complete
self.run_forever()
│ └ <function BaseEventLoop.run_forever at 0x1015b5ee0>
└ <_UnixSelectorEventLoop running=True closed=False debug=False>
File "/Users/s13/.local/share/uv/python/cpython-3.13.7-macos-aarch64-none/lib/python3.13/asyncio/base_events.py", line 683, in run_forever
self._run_once()
│ └ <function BaseEventLoop._run_once at 0x1015b7ce0>
└ <_UnixSelectorEventLoop running=True closed=False debug=False>
File "/Users/s13/.local/share/uv/python/cpython-3.13.7-macos-aarch64-none/lib/python3.13/asyncio/base_events.py", line 2050, in _run_once
handle._run()
│ └ <function Handle._run at 0x1015394e0>
└ <Handle BaseSelectorEventLoop._accept_connection()>
File "/Users/s13/.local/share/uv/python/cpython-3.13.7-macos-aarch64-none/lib/python3.13/asyncio/events.py", line 89, in _run
self._context.run(self._callback, *self._args)
│ │ │ │ │ └ <member '_args' of 'Handle' objects>
│ │ │ │ └ <Handle BaseSelectorEventLoop._accept_connection()>
│ │ │ └ <member '_callback' of 'Handle' objects>
│ │ └ <Handle BaseSelectorEventLoop._accept_connection()>
│ └ <member '_context' of 'Handle' objects>
└ <Handle BaseSelectorEventLoop._accept_connection()>
> File "/Users/s13/.local/share/uv/python/cpython-3.13.7-macos-aarch64-none/lib/python3.13/asyncio/selector_events.py", line 178, in _accept_connection
conn, addr = sock.accept()
│ └ <function socket.accept at 0x101109800>
└ <socket.socket fd=55, family=2, type=1, proto=0, laddr=('0.0.0.0', 52415)>
File "/Users/s13/.local/share/uv/python/cpython-3.13.7-macos-aarch64-none/lib/python3.13/socket.py", line 295, in accept
fd, addr = self._accept()
│ └ <method '_accept' of '_socket.socket' objects>
└ <socket.socket fd=55, family=2, type=1, proto=0, laddr=('0.0.0.0', 52415)>
OSError: [Errno 24] Too many open files
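Errno 24 is EMFILE: the process has reached its per-process limit on open file descriptors, so accept() cannot allocate one for the incoming connection. The sketch below inspects that limit with Python's standard `resource` module and raises the soft limit toward a target. The function name and the 4096 target are illustrative assumptions, not exo settings, and a higher limit would only delay the failure if descriptors are being leaked.

```python
import resource


def raise_nofile_limit(target: int = 4096) -> tuple[int, int]:
    """Raise the soft RLIMIT_NOFILE toward `target`; return (soft, hard) afterwards.

    `target` is an arbitrary illustrative value. The soft limit is never
    raised above the hard limit, and it is never lowered.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # The hard limit caps what an unprivileged process may request.
    ceiling = target if hard == resource.RLIM_INFINITY else min(target, hard)
    if soft != resource.RLIM_INFINITY and soft < ceiling:
        resource.setrlimit(resource.RLIMIT_NOFILE, (ceiling, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    print("before:", resource.getrlimit(resource.RLIMIT_NOFILE))
    print("after: ", raise_nofile_limit())
```

This is a diagnostic workaround only. If the error goes away with a higher limit but the descriptor count still grows with every launch, the underlying leak remains.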
Environment
- macOS Version: 26.3
- EXO Version: Latest main (59e7594e3412a3164caa7de5d92416ec542fd67a)
- Hardware:
  - 2 x 512GB M3 Ultra Mac Studio
- Interconnection:
  - TB5 + Ethernet (all-to-all)
If the cluster is left in this state long enough, the following error eventually appears as well, and all nodes drop out of the topology completely:
[ 12:20:32.3693PM | ERROR ] Error in ASGI Framework
Traceback (most recent call last):
File "/Users/s13/exo/.venv/bin/exo", line 10, in <module>
sys.exit(main())
│ │ └ <function main at 0x100f3de40>
│ └ <built-in function exit>
└ <module 'sys' (built-in)>
File "/Users/s13/exo/src/exo/main.py", line 206, in main
anyio.run(node.run)
│ │ │ └ <function Node.run at 0x10937a980>
│ │ └ Node(router=<exo.routing.router.Router object at 0x109434830>, worker=<exo.worker.main.Worker object at 0x1094acf50>, electio...
│ └ <function run at 0x101234b80>
└ <module 'anyio' from '/Users/s13/exo/.venv/lib/python3.13/site-packages/anyio/__init__.py'>
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/anyio/_core/_eventloop.py", line 74, in run
return async_backend.run(func, args, {}, backend_options)
│ │ │ │ └ {}
│ │ │ └ ()
│ │ └ <bound method Node.run of Node(router=<exo.routing.router.Router object at 0x109434830>, worker=<exo.worker.main.Worker objec...
│ └ <classmethod(<function AsyncIOBackend.run at 0x109461440>)>
└ <class 'anyio._backends._asyncio.AsyncIOBackend'>
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/anyio/_backends/_asyncio.py", line 2325, in run
return runner.run(wrapper())
│ │ └ <function Node.run at 0x1094631a0>
│ └ <function Runner.run at 0x1015b8540>
└ <asyncio.runners.Runner object at 0x1094ac050>
File "/Users/s13/.local/share/uv/python/cpython-3.13.7-macos-aarch64-none/lib/python3.13/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
│ │ │ └ <Task pending name='exo.main.Node.run' coro=<Node.run() running at /Users/s13/exo/.venv/lib/python3.13/site-packages/anyio/_b...
│ │ └ <function BaseEventLoop.run_until_complete at 0x1015b5f80>
│ └ <_UnixSelectorEventLoop running=True closed=False debug=False>
└ <asyncio.runners.Runner object at 0x1094ac050>
File "/Users/s13/.local/share/uv/python/cpython-3.13.7-macos-aarch64-none/lib/python3.13/asyncio/base_events.py", line 712, in run_until_complete
self.run_forever()
│ └ <function BaseEventLoop.run_forever at 0x1015b5ee0>
└ <_UnixSelectorEventLoop running=True closed=False debug=False>
File "/Users/s13/.local/share/uv/python/cpython-3.13.7-macos-aarch64-none/lib/python3.13/asyncio/base_events.py", line 683, in run_forever
self._run_once()
│ └ <function BaseEventLoop._run_once at 0x1015b7ce0>
└ <_UnixSelectorEventLoop running=True closed=False debug=False>
File "/Users/s13/.local/share/uv/python/cpython-3.13.7-macos-aarch64-none/lib/python3.13/asyncio/base_events.py", line 2050, in _run_once
handle._run()
│ └ <function Handle._run at 0x1015394e0>
└ <Handle Task.task_wakeup()>
File "/Users/s13/.local/share/uv/python/cpython-3.13.7-macos-aarch64-none/lib/python3.13/asyncio/events.py", line 89, in _run
self._context.run(self._callback, *self._args)
│ │ │ │ │ └ <member '_args' of 'Handle' objects>
│ │ │ │ └ <Handle Task.task_wakeup()>
│ │ │ └ <member '_callback' of 'Handle' objects>
│ │ └ <Handle Task.task_wakeup()>
│ └ <member '_context' of 'Handle' objects>
└ <Handle Task.task_wakeup()>
> File "/Users/s13/exo/.venv/lib/python3.13/site-packages/hypercorn/asyncio/task_group.py", line 28, in _handle
await app(scope, receive, send, sync_spawn, call_soon)
│ │ │ │ │ └ <function TaskGroup.spawn_app.<locals>._call_soon at 0x10f399300>
│ │ │ │ └ functools.partial(<bound method BaseEventLoop.run_in_executor of <_UnixSelectorEventLoop running=True closed=False debug=Fals...
│ │ │ └ <bound method HTTPStream.app_send of <hypercorn.protocol.http_stream.HTTPStream object at 0x10f04c2c0>>
│ │ └ <bound method Queue.get of <Queue at 0x10eaca060 maxsize=10 _queue=[{'type': 'http.request', 'body': b'', 'more_body': False}...
│ └ {'type': 'http', 'http_version': '1.1', 'asgi': {'spec_version': '2.1', 'version': '3.0'}, 'method': 'GET', 'scheme': 'http',...
└ <hypercorn.app_wrappers.ASGIWrapper object at 0x1091f70e0>
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/hypercorn/app_wrappers.py", line 34, in __call__
await self.app(scope, receive, send)
│ │ │ │ └ <bound method HTTPStream.app_send of <hypercorn.protocol.http_stream.HTTPStream object at 0x10f04c2c0>>
│ │ │ └ <bound method Queue.get of <Queue at 0x10eaca060 maxsize=10 _queue=[{'type': 'http.request', 'body': b'', 'more_body': False}...
│ │ └ {'type': 'http', 'http_version': '1.1', 'asgi': {'spec_version': '2.1', 'version': '3.0'}, 'method': 'GET', 'scheme': 'http',...
│ └ <fastapi.applications.FastAPI object at 0x109435550>
└ <hypercorn.app_wrappers.ASGIWrapper object at 0x1091f70e0>
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/fastapi/applications.py", line 1134, in __call__
await super().__call__(scope, receive, send)
│ │ └ <bound method HTTPStream.app_send of <hypercorn.protocol.http_stream.HTTPStream object at 0x10f04c2c0>>
│ └ <bound method Queue.get of <Queue at 0x10eaca060 maxsize=10 _queue=[{'type': 'http.request', 'body': b'', 'more_body': False}...
└ {'type': 'http', 'http_version': '1.1', 'asgi': {'spec_version': '2.1', 'version': '3.0'}, 'method': 'GET', 'scheme': 'http',...
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/starlette/applications.py", line 113, in __call__
await self.middleware_stack(scope, receive, send)
│ │ │ │ └ <bound method HTTPStream.app_send of <hypercorn.protocol.http_stream.HTTPStream object at 0x10f04c2c0>>
│ │ │ └ <bound method Queue.get of <Queue at 0x10eaca060 maxsize=10 _queue=[{'type': 'http.request', 'body': b'', 'more_body': False}...
│ │ └ {'type': 'http', 'http_version': '1.1', 'asgi': {'spec_version': '2.1', 'version': '3.0'}, 'method': 'GET', 'scheme': 'http',...
│ └ <starlette.middleware.errors.ServerErrorMiddleware object at 0x109437770>
└ <fastapi.applications.FastAPI object at 0x109435550>
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/starlette/middleware/errors.py", line 186, in __call__
raise exc
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/starlette/middleware/errors.py", line 164, in __call__
await self.app(scope, receive, _send)
│ │ │ │ └ <function ServerErrorMiddleware.__call__.<locals>._send at 0x10f39ba60>
│ │ │ └ <bound method Queue.get of <Queue at 0x10eaca060 maxsize=10 _queue=[{'type': 'http.request', 'body': b'', 'more_body': False}...
│ │ └ {'type': 'http', 'http_version': '1.1', 'asgi': {'spec_version': '2.1', 'version': '3.0'}, 'method': 'GET', 'scheme': 'http',...
│ └ <starlette.middleware.cors.CORSMiddleware object at 0x109437620>
└ <starlette.middleware.errors.ServerErrorMiddleware object at 0x109437770>
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/starlette/middleware/cors.py", line 93, in __call__
await self.simple_response(scope, receive, send, request_headers=headers)
│ │ │ │ │ └ Headers({'host': 's13:52415', 'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, lik...
│ │ │ │ └ <function ServerErrorMiddleware.__call__.<locals>._send at 0x10f39ba60>
│ │ │ └ <bound method Queue.get of <Queue at 0x10eaca060 maxsize=10 _queue=[{'type': 'http.request', 'body': b'', 'more_body': False}...
│ │ └ {'type': 'http', 'http_version': '1.1', 'asgi': {'spec_version': '2.1', 'version': '3.0'}, 'method': 'GET', 'scheme': 'http',...
│ └ <function CORSMiddleware.simple_response at 0x1080c3a60>
└ <starlette.middleware.cors.CORSMiddleware object at 0x109437620>
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/starlette/middleware/cors.py", line 144, in simple_response
await self.app(scope, receive, send)
│ │ │ │ └ functools.partial(<bound method CORSMiddleware.send of <starlette.middleware.cors.CORSMiddleware object at 0x109437620>>, sen...
│ │ │ └ <bound method Queue.get of <Queue at 0x10eaca060 maxsize=10 _queue=[{'type': 'http.request', 'body': b'', 'more_body': False}...
│ │ └ {'type': 'http', 'http_version': '1.1', 'asgi': {'spec_version': '2.1', 'version': '3.0'}, 'method': 'GET', 'scheme': 'http',...
│ └ <starlette.middleware.exceptions.ExceptionMiddleware object at 0x1094374d0>
└ <starlette.middleware.cors.CORSMiddleware object at 0x109437620>
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/starlette/middleware/exceptions.py", line 63, in __call__
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
│ │ │ │ │ │ └ functools.partial(<bound method CORSMiddleware.send of <starlette.middleware.cors.CORSMiddleware object at 0x109437620>>, sen...
│ │ │ │ │ └ <bound method Queue.get of <Queue at 0x10eaca060 maxsize=10 _queue=[{'type': 'http.request', 'body': b'', 'more_body': False}...
│ │ │ │ └ {'type': 'http', 'http_version': '1.1', 'asgi': {'spec_version': '2.1', 'version': '3.0'}, 'method': 'GET', 'scheme': 'http',...
│ │ │ └ <starlette.requests.Request object at 0x10f399d10>
│ │ └ <fastapi.middleware.asyncexitstack.AsyncExitStackMiddleware object at 0x109437380>
│ └ <starlette.middleware.exceptions.ExceptionMiddleware object at 0x1094374d0>
└ <function wrap_app_handling_exceptions at 0x10805efc0>
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
raise exc
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
await app(scope, receive, sender)
│ │ │ └ <function wrap_app_handling_exceptions.<locals>.wrapped_app.<locals>.sender at 0x10f3989a0>
│ │ └ <bound method Queue.get of <Queue at 0x10eaca060 maxsize=10 _queue=[{'type': 'http.request', 'body': b'', 'more_body': False}...
│ └ {'type': 'http', 'http_version': '1.1', 'asgi': {'spec_version': '2.1', 'version': '3.0'}, 'method': 'GET', 'scheme': 'http',...
└ <fastapi.middleware.asyncexitstack.AsyncExitStackMiddleware object at 0x109437380>
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
await self.app(scope, receive, send)
│ │ │ │ └ <function wrap_app_handling_exceptions.<locals>.wrapped_app.<locals>.sender at 0x10f3989a0>
│ │ │ └ <bound method Queue.get of <Queue at 0x10eaca060 maxsize=10 _queue=[{'type': 'http.request', 'body': b'', 'more_body': False}...
│ │ └ {'type': 'http', 'http_version': '1.1', 'asgi': {'spec_version': '2.1', 'version': '3.0'}, 'method': 'GET', 'scheme': 'http',...
│ └ <fastapi.routing.APIRouter object at 0x1093f96d0>
└ <fastapi.middleware.asyncexitstack.AsyncExitStackMiddleware object at 0x109437380>
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/starlette/routing.py", line 716, in __call__
await self.middleware_stack(scope, receive, send)
│ │ │ │ └ <function wrap_app_handling_exceptions.<locals>.wrapped_app.<locals>.sender at 0x10f3989a0>
│ │ │ └ <bound method Queue.get of <Queue at 0x10eaca060 maxsize=10 _queue=[{'type': 'http.request', 'body': b'', 'more_body': False}...
│ │ └ {'type': 'http', 'http_version': '1.1', 'asgi': {'spec_version': '2.1', 'version': '3.0'}, 'method': 'GET', 'scheme': 'http',...
│ └ <bound method Router.app of <fastapi.routing.APIRouter object at 0x1093f96d0>>
└ <fastapi.routing.APIRouter object at 0x1093f96d0>
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/starlette/routing.py", line 736, in app
await route.handle(scope, receive, send)
│ │ │ │ └ <function wrap_app_handling_exceptions.<locals>.wrapped_app.<locals>.sender at 0x10f3989a0>
│ │ │ └ <bound method Queue.get of <Queue at 0x10eaca060 maxsize=10 _queue=[{'type': 'http.request', 'body': b'', 'more_body': False}...
│ │ └ {'type': 'http', 'http_version': '1.1', 'asgi': {'spec_version': '2.1', 'version': '3.0'}, 'method': 'GET', 'scheme': 'http',...
│ └ <function Mount.handle at 0x108089080>
└ Mount(path='', name='dashboard', app=<starlette.staticfiles.StaticFiles object at 0x1094363c0>)
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/starlette/routing.py", line 462, in handle
await self.app(scope, receive, send)
│ │ │ │ └ <function wrap_app_handling_exceptions.<locals>.wrapped_app.<locals>.sender at 0x10f3989a0>
│ │ │ └ <bound method Queue.get of <Queue at 0x10eaca060 maxsize=10 _queue=[{'type': 'http.request', 'body': b'', 'more_body': False}...
│ │ └ {'type': 'http', 'http_version': '1.1', 'asgi': {'spec_version': '2.1', 'version': '3.0'}, 'method': 'GET', 'scheme': 'http',...
│ └ <starlette.staticfiles.StaticFiles object at 0x1094363c0>
└ Mount(path='', name='dashboard', app=<starlette.staticfiles.StaticFiles object at 0x1094363c0>)
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/starlette/staticfiles.py", line 99, in __call__
await response(scope, receive, send)
│ │ │ └ <function wrap_app_handling_exceptions.<locals>.wrapped_app.<locals>.sender at 0x10f3989a0>
│ │ └ <bound method Queue.get of <Queue at 0x10eaca060 maxsize=10 _queue=[{'type': 'http.request', 'body': b'', 'more_body': False}...
│ └ {'type': 'http', 'http_version': '1.1', 'asgi': {'spec_version': '2.1', 'version': '3.0'}, 'method': 'GET', 'scheme': 'http',...
└ <starlette.responses.FileResponse object at 0x10eacb9b0>
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/starlette/responses.py", line 365, in __call__
await self._handle_simple(send, send_header_only, send_pathsend)
│ │ │ │ └ False
│ │ │ └ False
│ │ └ <function wrap_app_handling_exceptions.<locals>.wrapped_app.<locals>.sender at 0x10f3989a0>
│ └ <function FileResponse._handle_simple at 0x1080528e0>
└ <starlette.responses.FileResponse object at 0x10eacb9b0>
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/starlette/responses.py", line 391, in _handle_simple
async with await anyio.open_file(self.path, mode="rb") as file:
│ │ │ └ '/Users/s13/exo/dashboard/build/_app/immutable/chunks/CsenEIL4.js'
│ │ └ <starlette.responses.FileResponse object at 0x10eacb9b0>
│ └ <function open_file at 0x1013c1f80>
└ <module 'anyio' from '/Users/s13/exo/.venv/lib/python3.13/site-packages/anyio/__init__.py'>
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/anyio/_core/_fileio.py", line 187, in open_file
fp = await to_thread.run_sync(
│ └ <function run_sync at 0x101299260>
└ <module 'anyio.to_thread' from '/Users/s13/exo/.venv/lib/python3.13/site-packages/anyio/to_thread.py'>
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
└ <function get_async_backend at 0x1012363e0>
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/anyio/_backends/_asyncio.py", line 2485, in run_sync_in_worker_thread
return await future
└ <Future finished exception=OSError(24, 'Too many open files')>
File "/Users/s13/exo/.venv/lib/python3.13/site-packages/anyio/_backends/_asyncio.py", line 976, in run
result = context.run(func, *args)
OSError: [Errno 24] Too many open files: '/Users/s13/exo/dashboard/build/_app/immutable/chunks/CsenEIL4.js'
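This second traceback fails in a different place (opening a static dashboard file) but with the same errno as the accept() failure, which is consistent with the whole process running out of descriptors rather than a fault in the socket or static-file code specifically. The sketch below reproduces that behaviour in isolation by temporarily lowering the soft limit and opening descriptors until the OS refuses. The function name and the limit of 64 are illustrative assumptions and do not come from exo.

```python
import errno
import os
import resource


def descriptors_until_emfile(limit: int = 64) -> int:
    """Open descriptors under a temporarily lowered soft limit until EMFILE.

    Returns how many descriptors could be opened before the OS refused.
    The original soft limit is restored before returning.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        limit = min(limit, hard)  # the soft limit may not exceed the hard limit
    resource.setrlimit(resource.RLIMIT_NOFILE, (limit, hard))
    held: list[int] = []
    try:
        while True:
            # Each open consumes one descriptor, like a socket or file would.
            held.append(os.open(os.devnull, os.O_RDONLY))
    except OSError as exc:
        if exc.errno != errno.EMFILE:
            raise
        return len(held)
    finally:
        for fd in held:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))


if __name__ == "__main__":
    print("opened before EMFILE:", descriptors_until_emfile())
```

Once the limit is hit, any operation that needs a new descriptor fails the same way, whether it is accepting a connection or opening a file to serve.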
State:
{"instances":{},"runners":{"614867ad-ff63-49e4-8cca-8667d98ca993":{"RunnerReady":{}},"f9fe81a2-6c59-4b4d-bab3-6777de53636f":{"RunnerReady":{}},"46b3c4f0-0d16-4d8e-af79-d438c7b93ab7":{"RunnerReady":{}},"9f72771c-f83a-4334-bd49-da66521cf7d2":{"RunnerReady":{}},"cc42afcc-dafe-46c4-ac95-50b9e722652c":{"RunnerLoading":{}},"f153a3c7-c2f4-43e4-9d5d-8102c1981fcf":{"RunnerLoading":{}},"19acb176-9402-4178-97ec-42dabb766cff":{"RunnerReady":{}},"86a00a02-a097-4f42-92a4-ebb4aabfa782":{"RunnerReady":{}},"dbfe86a9-bc21-4ff6-abf8-aa6c03ad4466":{"RunnerWarmingUp":{}},"ddf82dda-6061-41e8-9749-1c749365abd5":{"RunnerWarmingUp":{}},"116aa6aa-724b-4c3e-95c4-fe2d7054baf3":{"RunnerLoading":{}},"1ca7f6ed-54aa-48d3-a3a0-cb10993c7204":{"RunnerLoading":{}},"63630962-f375-454a-bf4d-ecd4980400f4":{"RunnerReady":{}},"16c2f841-ec62-451b-9272-30b9448651e4":{"RunnerReady":{}},"d9e97a79-8e17-4d3b-9d33-4910077fd3fd":{"RunnerIdle":{}}},"downloads":{"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":[{"DownloadCompleted":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Llama-3.2-1B-Instruct-4bit","prettyName":"Llama 3.2 1B (4-bit)","storageSize":{"inBytes":695242752},"nLayers":16,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":16,"nLayers":16}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Llama-3.2-3B-Instruct-4bit","prettyName":"Llama 3.2 3B 
(4-bit)","storageSize":{"inBytes":1807423488},"nLayers":28,"hiddenSize":3072,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":28,"nLayers":28}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Meta-Llama-3.1-8B-Instruct-4bit","prettyName":"Llama 3.1 8B (4-bit)","storageSize":{"inBytes":4517404672},"nLayers":32,"hiddenSize":4096,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":32,"nLayers":32}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Llama-3.2-3B-Instruct-8bit","prettyName":"Llama 3.2 3B (8-bit)","storageSize":{"inBytes":3413710848},"nLayers":28,"hiddenSize":3072,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":28,"nLayers":28}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Meta-Llama-3.1-8B-Instruct-8bit","prettyName":"Llama 3.1 8B (8-bit)","storageSize":{"inBytes":8532402176},"nLayers":32,"hiddenSize":4096,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":32,"nLayers":32}}}},{"DownloadCompleted":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-0.6B-4bit","prettyName":"Qwen3 0.6B 
(4-bit)","storageSize":{"inBytes":335372288},"nLayers":28,"hiddenSize":1024,"supportsTensor":false},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":28,"nLayers":28}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/gpt-oss-20b-MXFP4-Q4","prettyName":"GPT-OSS 20B (MXFP4-Q4, MLX)","storageSize":{"inBytes":11178480768},"nLayers":24,"hiddenSize":2880,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":24,"nLayers":24}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-0.6B-8bit","prettyName":"Qwen3 0.6B (8-bit)","storageSize":{"inBytes":633364480},"nLayers":28,"hiddenSize":1024,"supportsTensor":false},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":28,"nLayers":28}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Meta-Llama-3.1-8B-Instruct-bf16","prettyName":"Llama 3.1 8B (BF16)","storageSize":{"inBytes":16060522496},"nLayers":32,"hiddenSize":4096,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":32,"nLayers":32}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Meta-Llama-3.1-70B-Instruct-4bit","prettyName":"Llama 3.1 70B 
(4-bit)","storageSize":{"inBytes":39688355840},"nLayers":80,"hiddenSize":8192,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":80,"nLayers":80}}}},{"DownloadCompleted":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B (4-bit)","storageSize":{"inBytes":17174622208},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Llama-3.3-70B-Instruct-4bit","prettyName":"Llama 3.3 70B (4-bit)","storageSize":{"inBytes":39688355840},"nLayers":80,"hiddenSize":8192,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":80,"nLayers":80}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-8bit","prettyName":"Qwen3 30B A3B (8-bit)","storageSize":{"inBytes":32440578048},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Llama-3.3-70B-Instruct-8bit","prettyName":"Llama 3.3 70B 
(8-bit)","storageSize":{"inBytes":74964549632},"nLayers":80,"hiddenSize":8192,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":80,"nLayers":80}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-Next-80B-A3B-Thinking-4bit","prettyName":"Qwen3 80B A3B Thinking (4-bit)","storageSize":{"inBytes":44844060160},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/gpt-oss-120b-MXFP4-Q8","prettyName":"GPT-OSS 120B (MXFP4-Q8, MLX)","storageSize":{"inBytes":63386815104},"nLayers":36,"hiddenSize":2880,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":36,"nLayers":36}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-Next-80B-A3B-Instruct-4bit","prettyName":"Qwen3 80B A3B (4-bit)","storageSize":{"inBytes":44844060160},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-Next-80B-A3B-Thinking-8bit","prettyName":"Qwen3 80B A3B Thinking 
(8-bit)","storageSize":{"inBytes":84655345152},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-Next-80B-A3B-Instruct-8bit","prettyName":"Qwen3 80B A3B (8-bit)","storageSize":{"inBytes":84655345152},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/llama-3.3-70b-instruct-fp16","prettyName":"Llama 3.3 70B (FP16)","storageSize":{"inBytes":141107412992},"nLayers":80,"hiddenSize":8192,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":80,"nLayers":80}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/GLM-4.5-Air-8bit","prettyName":"GLM 4.5 Air 8bit","storageSize":{"inBytes":113553627648},"nLayers":46,"hiddenSize":4096,"supportsTensor":false},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":46,"nLayers":46}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-235B-A22B-Instruct-2507-4bit","prettyName":"Qwen3 235B A22B 
(4-bit)","storageSize":{"inBytes":132241316864},"nLayers":94,"hiddenSize":4096,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":94,"nLayers":94}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/GLM-4.5-Air-bf16","prettyName":"GLM 4.5 Air bf16","storageSize":{"inBytes":213704514048},"nLayers":46,"hiddenSize":4096,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":46,"nLayers":46}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-235B-A22B-Instruct-2507-8bit","prettyName":"Qwen3 235B A22B (8-bit)","storageSize":{"inBytes":249787735040},"nLayers":94,"hiddenSize":4096,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":94,"nLayers":94}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-Coder-480B-A35B-Instruct-4bit","prettyName":"Qwen3 Coder 480B A35B (4-bit)","storageSize":{"inBytes":270088244224},"nLayers":62,"hiddenSize":6144,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":62,"nLayers":62}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/DeepSeek-V3.1-4bit","prettyName":"DeepSeek V3.1 
(4-bit)","storageSize":{"inBytes":377606852608},"nLayers":61,"hiddenSize":7168,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":61,"nLayers":61}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-Coder-480B-A35B-Instruct-8bit","prettyName":"Qwen3 Coder 480B A35B (8-bit)","storageSize":{"inBytes":540171111424},"nLayers":62,"hiddenSize":6144,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":62,"nLayers":62}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/DeepSeek-V3.1-8bit","prettyName":"DeepSeek V3.1 (8-bit)","storageSize":{"inBytes":713066336256},"nLayers":61,"hiddenSize":7168,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":61,"nLayers":61}}}},{"DownloadCompleted":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Kimi-K2-Thinking","prettyName":"Kimi K2 Thinking (4-bit)","storageSize":{"inBytes":657623228416},"nLayers":61,"hiddenSize":7168,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":61,"nLayers":61}}}},{"DownloadPending":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Kimi-K2-Instruct-4bit","prettyName":"Kimi K2 Instruct 
(4-bit)","storageSize":{"inBytes":577593561088},"nLayers":61,"hiddenSize":7168,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":61,"nLayers":61}}}},{"DownloadCompleted":{"nodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P","shardMetadata":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B (4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":1,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}}}}],"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":[{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Llama-3.2-3B-Instruct-4bit","prettyName":"Llama 3.2 3B (4-bit)","storageSize":{"inBytes":1807423488},"nLayers":28,"hiddenSize":3072,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":28,"nLayers":28}}}},{"DownloadCompleted":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Llama-3.2-1B-Instruct-4bit","prettyName":"Llama 3.2 1B (4-bit)","storageSize":{"inBytes":695242752},"nLayers":16,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":16,"nLayers":16}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Llama-3.2-3B-Instruct-8bit","prettyName":"Llama 3.2 3B 
(8-bit)","storageSize":{"inBytes":3413710848},"nLayers":28,"hiddenSize":3072,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":28,"nLayers":28}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Meta-Llama-3.1-8B-Instruct-8bit","prettyName":"Llama 3.1 8B (8-bit)","storageSize":{"inBytes":8532402176},"nLayers":32,"hiddenSize":4096,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":32,"nLayers":32}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Meta-Llama-3.1-8B-Instruct-4bit","prettyName":"Llama 3.1 8B (4-bit)","storageSize":{"inBytes":4517404672},"nLayers":32,"hiddenSize":4096,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":32,"nLayers":32}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/gpt-oss-20b-MXFP4-Q4","prettyName":"GPT-OSS 20B (MXFP4-Q4, MLX)","storageSize":{"inBytes":11178480768},"nLayers":24,"hiddenSize":2880,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":24,"nLayers":24}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-0.6B-8bit","prettyName":"Qwen3 0.6B 
(8-bit)","storageSize":{"inBytes":633364480},"nLayers":28,"hiddenSize":1024,"supportsTensor":false},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":28,"nLayers":28}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Meta-Llama-3.1-8B-Instruct-bf16","prettyName":"Llama 3.1 8B (BF16)","storageSize":{"inBytes":16060522496},"nLayers":32,"hiddenSize":4096,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":32,"nLayers":32}}}},{"DownloadCompleted":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-0.6B-4bit","prettyName":"Qwen3 0.6B (4-bit)","storageSize":{"inBytes":335372288},"nLayers":28,"hiddenSize":1024,"supportsTensor":false},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":28,"nLayers":28}}}},{"DownloadCompleted":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B (4-bit)","storageSize":{"inBytes":17174622208},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Meta-Llama-3.1-70B-Instruct-4bit","prettyName":"Llama 3.1 70B 
(4-bit)","storageSize":{"inBytes":39688355840},"nLayers":80,"hiddenSize":8192,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":80,"nLayers":80}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Llama-3.3-70B-Instruct-4bit","prettyName":"Llama 3.3 70B (4-bit)","storageSize":{"inBytes":39688355840},"nLayers":80,"hiddenSize":8192,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":80,"nLayers":80}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-8bit","prettyName":"Qwen3 30B A3B (8-bit)","storageSize":{"inBytes":32440578048},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-Next-80B-A3B-Instruct-4bit","prettyName":"Qwen3 80B A3B (4-bit)","storageSize":{"inBytes":44844060160},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/gpt-oss-120b-MXFP4-Q8","prettyName":"GPT-OSS 120B (MXFP4-Q8, 
MLX)","storageSize":{"inBytes":63386815104},"nLayers":36,"hiddenSize":2880,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":36,"nLayers":36}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Llama-3.3-70B-Instruct-8bit","prettyName":"Llama 3.3 70B (8-bit)","storageSize":{"inBytes":74964549632},"nLayers":80,"hiddenSize":8192,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":80,"nLayers":80}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-Next-80B-A3B-Thinking-4bit","prettyName":"Qwen3 80B A3B Thinking (4-bit)","storageSize":{"inBytes":44844060160},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-Next-80B-A3B-Instruct-8bit","prettyName":"Qwen3 80B A3B (8-bit)","storageSize":{"inBytes":84655345152},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-Next-80B-A3B-Thinking-8bit","prettyName":"Qwen3 80B A3B Thinking 
(8-bit)","storageSize":{"inBytes":84655345152},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/GLM-4.5-Air-8bit","prettyName":"GLM 4.5 Air 8bit","storageSize":{"inBytes":113553627648},"nLayers":46,"hiddenSize":4096,"supportsTensor":false},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":46,"nLayers":46}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/llama-3.3-70b-instruct-fp16","prettyName":"Llama 3.3 70B (FP16)","storageSize":{"inBytes":141107412992},"nLayers":80,"hiddenSize":8192,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":80,"nLayers":80}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-235B-A22B-Instruct-2507-4bit","prettyName":"Qwen3 235B A22B (4-bit)","storageSize":{"inBytes":132241316864},"nLayers":94,"hiddenSize":4096,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":94,"nLayers":94}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/GLM-4.5-Air-bf16","prettyName":"GLM 4.5 Air 
bf16","storageSize":{"inBytes":213704514048},"nLayers":46,"hiddenSize":4096,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":46,"nLayers":46}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-235B-A22B-Instruct-2507-8bit","prettyName":"Qwen3 235B A22B (8-bit)","storageSize":{"inBytes":249787735040},"nLayers":94,"hiddenSize":4096,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":94,"nLayers":94}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-Coder-480B-A35B-Instruct-4bit","prettyName":"Qwen3 Coder 480B A35B (4-bit)","storageSize":{"inBytes":270088244224},"nLayers":62,"hiddenSize":6144,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":62,"nLayers":62}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/DeepSeek-V3.1-4bit","prettyName":"DeepSeek V3.1 (4-bit)","storageSize":{"inBytes":377606852608},"nLayers":61,"hiddenSize":7168,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":61,"nLayers":61}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-Coder-480B-A35B-Instruct-8bit","prettyName":"Qwen3 Coder 480B A35B 
(8-bit)","storageSize":{"inBytes":540171111424},"nLayers":62,"hiddenSize":6144,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":62,"nLayers":62}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/DeepSeek-V3.1-8bit","prettyName":"DeepSeek V3.1 (8-bit)","storageSize":{"inBytes":713066336256},"nLayers":61,"hiddenSize":7168,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":61,"nLayers":61}}}},{"DownloadPending":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Kimi-K2-Instruct-4bit","prettyName":"Kimi K2 Instruct (4-bit)","storageSize":{"inBytes":577593561088},"nLayers":61,"hiddenSize":7168,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":61,"nLayers":61}}}},{"DownloadCompleted":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Kimi-K2-Thinking","prettyName":"Kimi K2 Thinking (4-bit)","storageSize":{"inBytes":657623228416},"nLayers":61,"hiddenSize":7168,"supportsTensor":true},"deviceRank":0,"worldSize":1,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":61,"nLayers":61}}}},{"DownloadCompleted":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Llama-3.2-1B-Instruct-4bit","prettyName":"Llama 3.2 1B 
(4-bit)","storageSize":{"inBytes":729808896},"nLayers":16,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":16,"nLayers":16}}}},{"DownloadCompleted":{"nodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5","shardMetadata":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B (4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}}}}]},"tasks":{"0ba3f8cc-b492-451f-be6b-af9a743bfb6d":{"CreateRunner":{"taskId":"0ba3f8cc-b492-451f-be6b-af9a743bfb6d","taskStatus":"Complete","instanceId":"fc278494-6bba-4897-ba06-328b19b4c442","boundInstance":{"instance":{"MlxJacclInstance":{"instanceId":"fc278494-6bba-4897-ba06-328b19b4c442","shardAssignments":{"modelId":"mlx-community/Kimi-K2-Thinking","runnerToShard":{"f9fe81a2-6c59-4b4d-bab3-6777de53636f":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Kimi-K2-Thinking","prettyName":"Kimi K2 Thinking (4-bit)","storageSize":{"inBytes":706522120192},"nLayers":61,"hiddenSize":7168,"supportsTensor":true},"deviceRank":0,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":61,"nLayers":61}},"614867ad-ff63-49e4-8cca-8667d98ca993":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Kimi-K2-Thinking","prettyName":"Kimi K2 Thinking 
(4-bit)","storageSize":{"inBytes":706522120192},"nLayers":61,"hiddenSize":7168,"supportsTensor":true},"deviceRank":1,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":61,"nLayers":61}}},"nodeToRunner":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"f9fe81a2-6c59-4b4d-bab3-6777de53636f","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"614867ad-ff63-49e4-8cca-8667d98ca993"}},"ibvDevices":[[null,"rdma_en3"],["rdma_en3",null]],"jacclCoordinators":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"0.0.0.0:52414","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"10.173.5.227:52414"}}},"boundRunnerId":"614867ad-ff63-49e4-8cca-8667d98ca993","boundNodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P"}}},"d3e89e76-c522-473b-bdc2-e7868bebadb5":{"CreateRunner":{"taskId":"d3e89e76-c522-473b-bdc2-e7868bebadb5","taskStatus":"Complete","instanceId":"fc278494-6bba-4897-ba06-328b19b4c442","boundInstance":{"instance":{"MlxJacclInstance":{"instanceId":"fc278494-6bba-4897-ba06-328b19b4c442","shardAssignments":{"modelId":"mlx-community/Kimi-K2-Thinking","runnerToShard":{"f9fe81a2-6c59-4b4d-bab3-6777de53636f":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Kimi-K2-Thinking","prettyName":"Kimi K2 Thinking (4-bit)","storageSize":{"inBytes":706522120192},"nLayers":61,"hiddenSize":7168,"supportsTensor":true},"deviceRank":0,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":61,"nLayers":61}},"614867ad-ff63-49e4-8cca-8667d98ca993":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Kimi-K2-Thinking","prettyName":"Kimi K2 Thinking 
(4-bit)","storageSize":{"inBytes":706522120192},"nLayers":61,"hiddenSize":7168,"supportsTensor":true},"deviceRank":1,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":61,"nLayers":61}}},"nodeToRunner":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"f9fe81a2-6c59-4b4d-bab3-6777de53636f","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"614867ad-ff63-49e4-8cca-8667d98ca993"}},"ibvDevices":[[null,"rdma_en3"],["rdma_en3",null]],"jacclCoordinators":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"0.0.0.0:52414","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"10.173.5.227:52414"}}},"boundRunnerId":"f9fe81a2-6c59-4b4d-bab3-6777de53636f","boundNodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5"}}},"66247976-e3e2-4ca2-9c88-c19703340308":{"ConnectToGroup":{"taskId":"66247976-e3e2-4ca2-9c88-c19703340308","taskStatus":"Complete","instanceId":"fc278494-6bba-4897-ba06-328b19b4c442"}},"b960c5fd-351f-4e26-bee5-068807201753":{"ConnectToGroup":{"taskId":"b960c5fd-351f-4e26-bee5-068807201753","taskStatus":"Complete","instanceId":"fc278494-6bba-4897-ba06-328b19b4c442"}},"8db7427d-5d74-4201-8454-119a180b0017":{"LoadModel":{"taskId":"8db7427d-5d74-4201-8454-119a180b0017","taskStatus":"Complete","instanceId":"fc278494-6bba-4897-ba06-328b19b4c442"}},"0b305426-a2ab-4478-b6ae-bdcb617ba2be":{"LoadModel":{"taskId":"0b305426-a2ab-4478-b6ae-bdcb617ba2be","taskStatus":"Complete","instanceId":"fc278494-6bba-4897-ba06-328b19b4c442"}},"311af706-8d1b-403a-a5e8-655103281e40":{"StartWarmup":{"taskId":"311af706-8d1b-403a-a5e8-655103281e40","taskStatus":"Complete","instanceId":"fc278494-6bba-4897-ba06-328b19b4c442"}},"6660256e-da3e-42c1-a199-cd55d3f0ad1c":{"StartWarmup":{"taskId":"6660256e-da3e-42c1-a199-cd55d3f0ad1c","taskStatus":"Complete","instanceId":"fc278494-6bba-4897-ba06-328b19b4c442"}},"25ed46b9-5883-4717-9fc6-2de1fdf75fd5":{"CreateRunner":{"taskId":"25ed46b9-5883-4717-9fc6-2de1fdf75fd5","taskStatus":"Complete","inst
anceId":"5c41d169-0bd2-418e-af5b-55cbc9230cbb","boundInstance":{"instance":{"MlxJacclInstance":{"instanceId":"5c41d169-0bd2-418e-af5b-55cbc9230cbb","shardAssignments":{"modelId":"mlx-community/Llama-3.2-1B-Instruct-4bit","runnerToShard":{"9f72771c-f83a-4334-bd49-da66521cf7d2":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Llama-3.2-1B-Instruct-4bit","prettyName":"Llama 3.2 1B (4-bit)","storageSize":{"inBytes":729808896},"nLayers":16,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":16,"nLayers":16}},"46b3c4f0-0d16-4d8e-af79-d438c7b93ab7":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Llama-3.2-1B-Instruct-4bit","prettyName":"Llama 3.2 1B (4-bit)","storageSize":{"inBytes":729808896},"nLayers":16,"hiddenSize":2048,"supportsTensor":true},"deviceRank":1,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":16,"nLayers":16}}},"nodeToRunner":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"9f72771c-f83a-4334-bd49-da66521cf7d2","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"46b3c4f0-0d16-4d8e-af79-d438c7b93ab7"}},"ibvDevices":[[null,"rdma_en3"],["rdma_en3",null]],"jacclCoordinators":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"0.0.0.0:52414","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"10.173.5.227:52414"}}},"boundRunnerId":"9f72771c-f83a-4334-bd49-da66521cf7d2","boundNodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5"}}},"35205e47-8bf1-4a02-9b0f-55df34f4fbba":{"CreateRunner":{"taskId":"35205e47-8bf1-4a02-9b0f-55df34f4fbba","taskStatus":"Complete","instanceId":"5c41d169-0bd2-418e-af5b-55cbc9230cbb","boundInstance":{"instance":{"MlxJacclInstance":{"instanceId":"5c41d169-0bd2-418e-af5b-55cbc9230cbb","shardAssignments":{"modelId":"mlx-community/Llama-3.2-1B-Instruct-4bit","runnerToShard":{"9f72771c-f83a-4334-bd49-da66521cf7d2":{"TensorShardMetadata":{"modelMeta":{"modelI
d":"mlx-community/Llama-3.2-1B-Instruct-4bit","prettyName":"Llama 3.2 1B (4-bit)","storageSize":{"inBytes":729808896},"nLayers":16,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":16,"nLayers":16}},"46b3c4f0-0d16-4d8e-af79-d438c7b93ab7":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Llama-3.2-1B-Instruct-4bit","prettyName":"Llama 3.2 1B (4-bit)","storageSize":{"inBytes":729808896},"nLayers":16,"hiddenSize":2048,"supportsTensor":true},"deviceRank":1,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":16,"nLayers":16}}},"nodeToRunner":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"9f72771c-f83a-4334-bd49-da66521cf7d2","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"46b3c4f0-0d16-4d8e-af79-d438c7b93ab7"}},"ibvDevices":[[null,"rdma_en3"],["rdma_en3",null]],"jacclCoordinators":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"0.0.0.0:52414","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"10.173.5.227:52414"}}},"boundRunnerId":"46b3c4f0-0d16-4d8e-af79-d438c7b93ab7","boundNodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P"}}},"6346a87a-9e8a-49f8-a74f-196f7b97bb38":{"DownloadModel":{"taskId":"6346a87a-9e8a-49f8-a74f-196f7b97bb38","taskStatus":"Complete","instanceId":"5c41d169-0bd2-418e-af5b-55cbc9230cbb","shardMetadata":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Llama-3.2-1B-Instruct-4bit","prettyName":"Llama 3.2 1B 
(4-bit)","storageSize":{"inBytes":729808896},"nLayers":16,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":16,"nLayers":16}}}},"dd0ddf00-fd0a-4e17-807d-e2a2f314f20e":{"ConnectToGroup":{"taskId":"dd0ddf00-fd0a-4e17-807d-e2a2f314f20e","taskStatus":"Complete","instanceId":"5c41d169-0bd2-418e-af5b-55cbc9230cbb"}},"58e48d16-ff8b-4d1d-8359-8a2476692787":{"ConnectToGroup":{"taskId":"58e48d16-ff8b-4d1d-8359-8a2476692787","taskStatus":"Complete","instanceId":"5c41d169-0bd2-418e-af5b-55cbc9230cbb"}},"792f3c85-248d-400b-afb9-6882bbe5ffbd":{"CreateRunner":{"taskId":"792f3c85-248d-400b-afb9-6882bbe5ffbd","taskStatus":"Complete","instanceId":"5d867b15-bf7a-4708-bcc4-8857229a7822","boundInstance":{"instance":{"MlxJacclInstance":{"instanceId":"5d867b15-bf7a-4708-bcc4-8857229a7822","shardAssignments":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","runnerToShard":{"f153a3c7-c2f4-43e4-9d5d-8102c1981fcf":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B (4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}},"cc42afcc-dafe-46c4-ac95-50b9e722652c":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B 
(4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":1,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}}},"nodeToRunner":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"f153a3c7-c2f4-43e4-9d5d-8102c1981fcf","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"cc42afcc-dafe-46c4-ac95-50b9e722652c"}},"ibvDevices":[[null,"rdma_en3"],["rdma_en3",null]],"jacclCoordinators":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"0.0.0.0:50768","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"10.173.5.227:50768"}}},"boundRunnerId":"f153a3c7-c2f4-43e4-9d5d-8102c1981fcf","boundNodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5"}}},"5e78b313-32d6-4d46-9859-93232e3b670b":{"CreateRunner":{"taskId":"5e78b313-32d6-4d46-9859-93232e3b670b","taskStatus":"Complete","instanceId":"5d867b15-bf7a-4708-bcc4-8857229a7822","boundInstance":{"instance":{"MlxJacclInstance":{"instanceId":"5d867b15-bf7a-4708-bcc4-8857229a7822","shardAssignments":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","runnerToShard":{"f153a3c7-c2f4-43e4-9d5d-8102c1981fcf":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B (4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}},"cc42afcc-dafe-46c4-ac95-50b9e722652c":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B 
(4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":1,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}}},"nodeToRunner":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"f153a3c7-c2f4-43e4-9d5d-8102c1981fcf","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"cc42afcc-dafe-46c4-ac95-50b9e722652c"}},"ibvDevices":[[null,"rdma_en3"],["rdma_en3",null]],"jacclCoordinators":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"0.0.0.0:50768","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"10.173.5.227:50768"}}},"boundRunnerId":"cc42afcc-dafe-46c4-ac95-50b9e722652c","boundNodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P"}}},"f5cf062e-f3aa-4512-a2be-c8f7735005d0":{"DownloadModel":{"taskId":"f5cf062e-f3aa-4512-a2be-c8f7735005d0","taskStatus":"Complete","instanceId":"5d867b15-bf7a-4708-bcc4-8857229a7822","shardMetadata":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B (4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}}}},"a827252e-2db4-4a76-bfd0-3b68b4a4d483":{"DownloadModel":{"taskId":"a827252e-2db4-4a76-bfd0-3b68b4a4d483","taskStatus":"Complete","instanceId":"5d867b15-bf7a-4708-bcc4-8857229a7822","shardMetadata":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B 
(4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":1,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}}}},"7d45ef3a-b2ec-495c-8d16-353ab60c3539":{"LoadModel":{"taskId":"7d45ef3a-b2ec-495c-8d16-353ab60c3539","taskStatus":"Complete","instanceId":"5c41d169-0bd2-418e-af5b-55cbc9230cbb"}},"0ac9a4dc-ef71-45cd-8530-2dd81d570dc5":{"LoadModel":{"taskId":"0ac9a4dc-ef71-45cd-8530-2dd81d570dc5","taskStatus":"Complete","instanceId":"5c41d169-0bd2-418e-af5b-55cbc9230cbb"}},"c23e2120-8feb-4d32-a787-da374437c59e":{"ConnectToGroup":{"taskId":"c23e2120-8feb-4d32-a787-da374437c59e","taskStatus":"Complete","instanceId":"5d867b15-bf7a-4708-bcc4-8857229a7822"}},"961e9168-c8ee-4554-8538-4d0af51aa3e9":{"ConnectToGroup":{"taskId":"961e9168-c8ee-4554-8538-4d0af51aa3e9","taskStatus":"Complete","instanceId":"5d867b15-bf7a-4708-bcc4-8857229a7822"}},"45fa5506-39bf-49c8-aac5-f4501a642284":{"StartWarmup":{"taskId":"45fa5506-39bf-49c8-aac5-f4501a642284","taskStatus":"Complete","instanceId":"5c41d169-0bd2-418e-af5b-55cbc9230cbb"}},"4aa76ac2-10ac-452c-b138-d42f72ba12f3":{"StartWarmup":{"taskId":"4aa76ac2-10ac-452c-b138-d42f72ba12f3","taskStatus":"Complete","instanceId":"5c41d169-0bd2-418e-af5b-55cbc9230cbb"}},"9f61cd96-9639-4609-b72f-07382c583570":{"LoadModel":{"taskId":"9f61cd96-9639-4609-b72f-07382c583570","taskStatus":"Running","instanceId":"5d867b15-bf7a-4708-bcc4-8857229a7822"}},"47368c33-19bc-4d55-9a89-d3f799ea8a55":{"LoadModel":{"taskId":"47368c33-19bc-4d55-9a89-d3f799ea8a55","taskStatus":"Running","instanceId":"5d867b15-bf7a-4708-bcc4-8857229a7822"}},"d40fb4d3-2940-4ee7-a0f9-8388c81479ad":{"CreateRunner":{"taskId":"d40fb4d3-2940-4ee7-a0f9-8388c81479ad","taskStatus":"Complete","instanceId":"cd836205-99fb-4047-b63c-92b7928fc9b6","boundInstance":{"instance":{"MlxRingInstance":{"instanceId":"cd836205-99fb-4047-b63c-92b7928fc9b6","shardAssignments":{"modelId":"mlx-community/Qwen3-3
0B-A3B-4bit","runnerToShard":{"86a00a02-a097-4f42-92a4-ebb4aabfa782":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B (4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}},"19acb176-9402-4178-97ec-42dabb766cff":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B (4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":1,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}}},"nodeToRunner":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"86a00a02-a097-4f42-92a4-ebb4aabfa782","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"19acb176-9402-4178-97ec-42dabb766cff"}},"hostsByNode":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":[{"ip":"0.0.0.0","port":52414},{"ip":"10.173.5.228","port":52414}],"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":[{"ip":"10.173.5.227","port":52414},{"ip":"0.0.0.0","port":52414}]},"ephemeralPort":52414}},"boundRunnerId":"19acb176-9402-4178-97ec-42dabb766cff","boundNodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P"}}},"a7018320-0cd3-4206-aa3a-fef2bd4c6fc7":{"CreateRunner":{"taskId":"a7018320-0cd3-4206-aa3a-fef2bd4c6fc7","taskStatus":"Complete","instanceId":"cd836205-99fb-4047-b63c-92b7928fc9b6","boundInstance":{"instance":{"MlxRingInstance":{"instanceId":"cd836205-99fb-4047-b63c-92b7928fc9b6","shardAssignments":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","runnerToShard":{"86a00a02-a097-4f42-92a4-ebb4aabfa782":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B 
(4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}},"19acb176-9402-4178-97ec-42dabb766cff":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B (4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":1,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}}},"nodeToRunner":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"86a00a02-a097-4f42-92a4-ebb4aabfa782","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"19acb176-9402-4178-97ec-42dabb766cff"}},"hostsByNode":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":[{"ip":"0.0.0.0","port":52414},{"ip":"10.173.5.228","port":52414}],"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":[{"ip":"10.173.5.227","port":52414},{"ip":"0.0.0.0","port":52414}]},"ephemeralPort":52414}},"boundRunnerId":"86a00a02-a097-4f42-92a4-ebb4aabfa782","boundNodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5"}}},"e16e2a82-7275-4e89-9f13-de8b5252221d":{"ConnectToGroup":{"taskId":"e16e2a82-7275-4e89-9f13-de8b5252221d","taskStatus":"Complete","instanceId":"cd836205-99fb-4047-b63c-92b7928fc9b6"}},"85540c11-f5f3-4504-be26-355b902eda41":{"ConnectToGroup":{"taskId":"85540c11-f5f3-4504-be26-355b902eda41","taskStatus":"Complete","instanceId":"cd836205-99fb-4047-b63c-92b7928fc9b6"}},"a3b62e76-1c92-49b9-99c9-c9511ef363c6":{"LoadModel":{"taskId":"a3b62e76-1c92-49b9-99c9-c9511ef363c6","taskStatus":"Complete","instanceId":"cd836205-99fb-4047-b63c-92b7928fc9b6"}},"6ad831b7-a0c9-4c45-a9b1-5e58f2b90ee1":{"LoadModel":{"taskId":"6ad831b7-a0c9-4c45-a9b1-5e58f2b90ee1","taskStatus":"Complete","instanceId":"cd836205-99fb-4047-b63c-92b7928fc9b6"}},"8506eaff-d927-47bb-9e6a-21d2721e590a":{"StartWarmup":{"taskId"
:"8506eaff-d927-47bb-9e6a-21d2721e590a","taskStatus":"Complete","instanceId":"cd836205-99fb-4047-b63c-92b7928fc9b6"}},"0c170c5a-a525-46df-98f2-47d0c9868b7a":{"StartWarmup":{"taskId":"0c170c5a-a525-46df-98f2-47d0c9868b7a","taskStatus":"Complete","instanceId":"cd836205-99fb-4047-b63c-92b7928fc9b6"}},"b45a20ba-6771-4e09-a18a-960656507111":{"CreateRunner":{"taskId":"b45a20ba-6771-4e09-a18a-960656507111","taskStatus":"Complete","instanceId":"d6726dd5-1cb1-415e-bfd7-d8734ded8af7","boundInstance":{"instance":{"MlxJacclInstance":{"instanceId":"d6726dd5-1cb1-415e-bfd7-d8734ded8af7","shardAssignments":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","runnerToShard":{"ddf82dda-6061-41e8-9749-1c749365abd5":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B (4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":24,"nLayers":48}},"dbfe86a9-bc21-4ff6-abf8-aa6c03ad4466":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B 
(4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":1,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":24,"endLayer":48,"nLayers":48}}},"nodeToRunner":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"ddf82dda-6061-41e8-9749-1c749365abd5","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"dbfe86a9-bc21-4ff6-abf8-aa6c03ad4466"}},"ibvDevices":[[null,"rdma_en3"],["rdma_en3",null]],"jacclCoordinators":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"0.0.0.0:52414","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"10.173.5.227:52414"}}},"boundRunnerId":"dbfe86a9-bc21-4ff6-abf8-aa6c03ad4466","boundNodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P"}}},"704d09a4-c1fb-4c11-8880-cfb8ef16f91b":{"CreateRunner":{"taskId":"704d09a4-c1fb-4c11-8880-cfb8ef16f91b","taskStatus":"Complete","instanceId":"d6726dd5-1cb1-415e-bfd7-d8734ded8af7","boundInstance":{"instance":{"MlxJacclInstance":{"instanceId":"d6726dd5-1cb1-415e-bfd7-d8734ded8af7","shardAssignments":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","runnerToShard":{"ddf82dda-6061-41e8-9749-1c749365abd5":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B (4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":24,"nLayers":48}},"dbfe86a9-bc21-4ff6-abf8-aa6c03ad4466":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B 
(4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":1,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":24,"endLayer":48,"nLayers":48}}},"nodeToRunner":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"ddf82dda-6061-41e8-9749-1c749365abd5","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"dbfe86a9-bc21-4ff6-abf8-aa6c03ad4466"}},"ibvDevices":[[null,"rdma_en3"],["rdma_en3",null]],"jacclCoordinators":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"0.0.0.0:52414","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"10.173.5.227:52414"}}},"boundRunnerId":"ddf82dda-6061-41e8-9749-1c749365abd5","boundNodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5"}}},"60996326-9de8-4b81-977e-50ab1ac8483e":{"ConnectToGroup":{"taskId":"60996326-9de8-4b81-977e-50ab1ac8483e","taskStatus":"Complete","instanceId":"d6726dd5-1cb1-415e-bfd7-d8734ded8af7"}},"bddd4b33-b867-4717-bb88-4bf761812db1":{"ConnectToGroup":{"taskId":"bddd4b33-b867-4717-bb88-4bf761812db1","taskStatus":"Complete","instanceId":"d6726dd5-1cb1-415e-bfd7-d8734ded8af7"}},"03b27dac-8398-439b-9a1c-eadd9a286219":{"LoadModel":{"taskId":"03b27dac-8398-439b-9a1c-eadd9a286219","taskStatus":"Complete","instanceId":"d6726dd5-1cb1-415e-bfd7-d8734ded8af7"}},"81fbf712-65d1-4ba0-baf2-9ba30aa882ff":{"LoadModel":{"taskId":"81fbf712-65d1-4ba0-baf2-9ba30aa882ff","taskStatus":"Complete","instanceId":"d6726dd5-1cb1-415e-bfd7-d8734ded8af7"}},"de3dc97a-454e-4e6f-9dbb-f0aeabe8ea91":{"StartWarmup":{"taskId":"de3dc97a-454e-4e6f-9dbb-f0aeabe8ea91","taskStatus":"Running","instanceId":"d6726dd5-1cb1-415e-bfd7-d8734ded8af7"}},"fa82423e-1fa8-455d-94b4-32c874e3303f":{"StartWarmup":{"taskId":"fa82423e-1fa8-455d-94b4-32c874e3303f","taskStatus":"Running","instanceId":"d6726dd5-1cb1-415e-bfd7-d8734ded8af7"}},"9d251388-4b1d-4c87-9e2f-202618027599":{"CreateRunner":{"taskId":"9d251388-4b1d-4c87-9e2f-202618027599","taskStatus":"Complete","instan
ceId":"2ecfe6cf-a57b-464c-8d4c-8d66d6c6b22d","boundInstance":{"instance":{"MlxJacclInstance":{"instanceId":"2ecfe6cf-a57b-464c-8d4c-8d66d6c6b22d","shardAssignments":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","runnerToShard":{"116aa6aa-724b-4c3e-95c4-fe2d7054baf3":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B (4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}},"1ca7f6ed-54aa-48d3-a3a0-cb10993c7204":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B (4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":1,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}}},"nodeToRunner":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"116aa6aa-724b-4c3e-95c4-fe2d7054baf3","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"1ca7f6ed-54aa-48d3-a3a0-cb10993c7204"}},"ibvDevices":[[null,"rdma_en3"],["rdma_en3",null]],"jacclCoordinators":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"0.0.0.0:52414","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"10.173.5.227:52414"}}},"boundRunnerId":"116aa6aa-724b-4c3e-95c4-fe2d7054baf3","boundNodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5"}}},"ff5e8174-1f35-4134-b383-90a96b5a714a":{"CreateRunner":{"taskId":"ff5e8174-1f35-4134-b383-90a96b5a714a","taskStatus":"Complete","instanceId":"2ecfe6cf-a57b-464c-8d4c-8d66d6c6b22d","boundInstance":{"instance":{"MlxJacclInstance":{"instanceId":"2ecfe6cf-a57b-464c-8d4c-8d66d6c6b22d","shardAssignments":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","runnerToShard":{"116aa6aa-724b-4c3e-95c4-fe2d7054baf3":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-
A3B-4bit","prettyName":"Qwen3 30B A3B (4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}},"1ca7f6ed-54aa-48d3-a3a0-cb10993c7204":{"TensorShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B (4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":1,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":48,"nLayers":48}}},"nodeToRunner":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"116aa6aa-724b-4c3e-95c4-fe2d7054baf3","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"1ca7f6ed-54aa-48d3-a3a0-cb10993c7204"}},"ibvDevices":[[null,"rdma_en3"],["rdma_en3",null]],"jacclCoordinators":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"0.0.0.0:52414","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"10.173.5.227:52414"}}},"boundRunnerId":"1ca7f6ed-54aa-48d3-a3a0-cb10993c7204","boundNodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P"}}},"14f679f8-6276-456a-a771-6f9c0bc3e0f4":{"ConnectToGroup":{"taskId":"14f679f8-6276-456a-a771-6f9c0bc3e0f4","taskStatus":"Complete","instanceId":"2ecfe6cf-a57b-464c-8d4c-8d66d6c6b22d"}},"4908d07c-d29e-4e41-b6cb-e69e756a7f4e":{"ConnectToGroup":{"taskId":"4908d07c-d29e-4e41-b6cb-e69e756a7f4e","taskStatus":"Complete","instanceId":"2ecfe6cf-a57b-464c-8d4c-8d66d6c6b22d"}},"f27efb0e-cff7-4af8-b295-4acb51f436eb":{"LoadModel":{"taskId":"f27efb0e-cff7-4af8-b295-4acb51f436eb","taskStatus":"Running","instanceId":"2ecfe6cf-a57b-464c-8d4c-8d66d6c6b22d"}},"0b3afc8e-f701-4e82-9f61-bbdabe5a4b70":{"LoadModel":{"taskId":"0b3afc8e-f701-4e82-9f61-bbdabe5a4b70","taskStatus":"Running","instanceId":"2ecfe6cf-a57b-464c-8d4c-8d66d6c6b22d"}},"0b52142a-4c59-4003-a1bb-edfcc85b46bc":{"Shutdown":{"taskId":"0b52142a-4c59-4003-a1bb-edf
cc85b46bc","taskStatus":"TimedOut","instanceId":"d6726dd5-1cb1-415e-bfd7-d8734ded8af7","runnerId":"dbfe86a9-bc21-4ff6-abf8-aa6c03ad4466"}},"25bb16e4-4087-46ad-b00e-f8f144a34958":{"Shutdown":{"taskId":"25bb16e4-4087-46ad-b00e-f8f144a34958","taskStatus":"TimedOut","instanceId":"d6726dd5-1cb1-415e-bfd7-d8734ded8af7","runnerId":"ddf82dda-6061-41e8-9749-1c749365abd5"}},"a4dcabf1-e194-4c10-a3d7-871d21102c82":{"Shutdown":{"taskId":"a4dcabf1-e194-4c10-a3d7-871d21102c82","taskStatus":"TimedOut","instanceId":"2ecfe6cf-a57b-464c-8d4c-8d66d6c6b22d","runnerId":"1ca7f6ed-54aa-48d3-a3a0-cb10993c7204"}},"87a32e3b-9584-480b-b64d-04dc72fc8ba4":{"Shutdown":{"taskId":"87a32e3b-9584-480b-b64d-04dc72fc8ba4","taskStatus":"TimedOut","instanceId":"2ecfe6cf-a57b-464c-8d4c-8d66d6c6b22d","runnerId":"116aa6aa-724b-4c3e-95c4-fe2d7054baf3"}},"5fffec18-7b1e-4227-a0e6-80d69a5e946f":{"Shutdown":{"taskId":"5fffec18-7b1e-4227-a0e6-80d69a5e946f","taskStatus":"TimedOut","instanceId":"5d867b15-bf7a-4708-bcc4-8857229a7822","runnerId":"cc42afcc-dafe-46c4-ac95-50b9e722652c"}},"2b81939b-d24f-410a-9cfb-4d0c54cc6226":{"Shutdown":{"taskId":"2b81939b-d24f-410a-9cfb-4d0c54cc6226","taskStatus":"TimedOut","instanceId":"5d867b15-bf7a-4708-bcc4-8857229a7822","runnerId":"f153a3c7-c2f4-43e4-9d5d-8102c1981fcf"}},"2026cc43-de6b-4e39-a7e3-279abc40276e":{"CreateRunner":{"taskId":"2026cc43-de6b-4e39-a7e3-279abc40276e","taskStatus":"Complete","instanceId":"4512fd12-60a7-4b22-b8e4-9c27dedd4ed3","boundInstance":{"instance":{"MlxRingInstance":{"instanceId":"4512fd12-60a7-4b22-b8e4-9c27dedd4ed3","shardAssignments":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","runnerToShard":{"63630962-f375-454a-bf4d-ecd4980400f4":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B 
(4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":24,"nLayers":48}},"16c2f841-ec62-451b-9272-30b9448651e4":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B (4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":1,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":24,"endLayer":48,"nLayers":48}}},"nodeToRunner":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"63630962-f375-454a-bf4d-ecd4980400f4","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"16c2f841-ec62-451b-9272-30b9448651e4"}},"hostsByNode":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":[{"ip":"0.0.0.0","port":52414},{"ip":"10.173.5.228","port":52414}],"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":[{"ip":"10.173.5.227","port":52414},{"ip":"0.0.0.0","port":52414}]},"ephemeralPort":52414}},"boundRunnerId":"63630962-f375-454a-bf4d-ecd4980400f4","boundNodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5"}}},"6112d6f1-33a8-44c3-89f4-b28733eaf50f":{"CreateRunner":{"taskId":"6112d6f1-33a8-44c3-89f4-b28733eaf50f","taskStatus":"Complete","instanceId":"4512fd12-60a7-4b22-b8e4-9c27dedd4ed3","boundInstance":{"instance":{"MlxRingInstance":{"instanceId":"4512fd12-60a7-4b22-b8e4-9c27dedd4ed3","shardAssignments":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","runnerToShard":{"63630962-f375-454a-bf4d-ecd4980400f4":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B 
(4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":24,"nLayers":48}},"16c2f841-ec62-451b-9272-30b9448651e4":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B (4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":1,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":24,"endLayer":48,"nLayers":48}}},"nodeToRunner":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"63630962-f375-454a-bf4d-ecd4980400f4","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"16c2f841-ec62-451b-9272-30b9448651e4"}},"hostsByNode":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":[{"ip":"0.0.0.0","port":52414},{"ip":"10.173.5.228","port":52414}],"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":[{"ip":"10.173.5.227","port":52414},{"ip":"0.0.0.0","port":52414}]},"ephemeralPort":52414}},"boundRunnerId":"16c2f841-ec62-451b-9272-30b9448651e4","boundNodeId":"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P"}}},"07e9e9d9-7203-4ad2-9b54-26e731bb112f":{"ConnectToGroup":{"taskId":"07e9e9d9-7203-4ad2-9b54-26e731bb112f","taskStatus":"Complete","instanceId":"4512fd12-60a7-4b22-b8e4-9c27dedd4ed3"}},"d73f8919-df1b-46c9-bee7-e6caae1f3f09":{"ConnectToGroup":{"taskId":"d73f8919-df1b-46c9-bee7-e6caae1f3f09","taskStatus":"Complete","instanceId":"4512fd12-60a7-4b22-b8e4-9c27dedd4ed3"}},"2e47225b-b681-4296-9ce2-c970ac1d167a":{"LoadModel":{"taskId":"2e47225b-b681-4296-9ce2-c970ac1d167a","taskStatus":"Complete","instanceId":"4512fd12-60a7-4b22-b8e4-9c27dedd4ed3"}},"253a3095-5936-42ca-b39f-985c44077a14":{"LoadModel":{"taskId":"253a3095-5936-42ca-b39f-985c44077a14","taskStatus":"Complete","instanceId":"4512fd12-60a7-4b22-b8e4-9c27dedd4ed3"}},"b535dfe9-cb26-43df-a217-8366bf4ee877":{"StartWarmup":{"task
Id":"b535dfe9-cb26-43df-a217-8366bf4ee877","taskStatus":"Complete","instanceId":"4512fd12-60a7-4b22-b8e4-9c27dedd4ed3"}},"b92bb532-7622-4d60-913a-c49e7174f54b":{"StartWarmup":{"taskId":"b92bb532-7622-4d60-913a-c49e7174f54b","taskStatus":"Complete","instanceId":"4512fd12-60a7-4b22-b8e4-9c27dedd4ed3"}},"7d4520f3-fedf-468e-b190-e34bf6036433":{"CreateRunner":{"taskId":"7d4520f3-fedf-468e-b190-e34bf6036433","taskStatus":"Complete","instanceId":"d375ad54-505c-4ae4-a68f-29edd6ac725e","boundInstance":{"instance":{"MlxRingInstance":{"instanceId":"d375ad54-505c-4ae4-a68f-29edd6ac725e","shardAssignments":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","runnerToShard":{"d9e97a79-8e17-4d3b-9d33-4910077fd3fd":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B (4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":0,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":0,"endLayer":24,"nLayers":48}},"5cf0e4a5-afea-4ea4-8845-ebc154bc9018":{"PipelineShardMetadata":{"modelMeta":{"modelId":"mlx-community/Qwen3-30B-A3B-4bit","prettyName":"Qwen3 30B A3B 
(4-bit)","storageSize":{"inBytes":17612931072},"nLayers":48,"hiddenSize":2048,"supportsTensor":true},"deviceRank":1,"worldSize":2,"immediateException":false,"shouldTimeout":null,"startLayer":24,"endLayer":48,"nLayers":48}}},"nodeToRunner":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":"d9e97a79-8e17-4d3b-9d33-4910077fd3fd","12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":"5cf0e4a5-afea-4ea4-8845-ebc154bc9018"}},"hostsByNode":{"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5":[{"ip":"0.0.0.0","port":52414},{"ip":"10.173.5.228","port":52414}],"12D3KooWRm5FoCfAX5uqCDa4Ky9DFQRty514QJLQnVwWHJFTqa4P":[{"ip":"100.88.54.38","port":52414},{"ip":"0.0.0.0","port":52414}]},"ephemeralPort":52414}},"boundRunnerId":"d9e97a79-8e17-4d3b-9d33-4910077fd3fd","boundNodeId":"12D3KooWNGQYtAMELqXPSbKs93UbVgiGh3ieBxXkayRhhST6ubL5"}}}},"nodeProfiles":{},"lastSeen":{},"topology":{"nodes":[],"connections":[]},"lastEventAppliedIdx":9262}
Hi @AlexCheema, I think this issue might be related to file descriptor limits. Can you please run ulimit -a and check your current limit? If it's set to 256, try increasing it to a higher value (e.g., ulimit -n 1024) and see if that resolves the issue.
Likely yes. We should fix this in exo itself, though, rather than rely on an ad-hoc shell command.
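One way to handle this inside the process is to raise the soft file-descriptor limit at startup with Python's standard `resource` module. The sketch below is only an illustration of that idea, not exo's actual fix: the function name `raise_nofile_limit` and the 4096 target are assumptions, and raising the limit would only mask the problem if sockets are being leaked when instances are launched or shut down.

```python
import resource


def raise_nofile_limit(target: int = 4096) -> tuple[int, int]:
    """Try to raise the soft RLIMIT_NOFILE to `target` (hypothetical helper).

    Returns the (soft, hard) limits in effect after the attempt.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    # The soft limit can never exceed the hard limit, so clamp the target.
    desired = target if hard == resource.RLIM_INFINITY else min(target, hard)

    # Only ever raise the limit; never lower an already generous setting.
    if soft != resource.RLIM_INFINITY and soft < desired:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (desired, hard))
        except (ValueError, OSError):
            # The OS may reject the request (for example a per-process cap
            # on macOS); in that case keep the existing limit.
            pass

    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = raise_nofile_limit()
    print(f"RLIMIT_NOFILE soft={soft} hard={hard}")
```

Calling something like this once before the event loop starts would have the same effect as running `ulimit -n` in the shell, but without requiring users to do it by hand. The `resource` module is only available on Unix-like systems.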