Issue with loading weights
I am trying to use two MacBooks to run a Llama 8B model, but I can't load the model weights for inference; the loading progress is stuck at 0...
Here's the info on my equipment:
- node1: MacBook Air, 16GB, M3 chip
- node2: MacBook Pro, 16GB, M1 chip (Intel-based)
Since my machines' resources are limited, both of them run the tinygrad inference engine rather than MLX. I also wonder why my MacBook Pro shows 0 TFLOPS?
I'd really appreciate it if someone could offer some help~
Can you try running with `SUPPORT_BF16=0`?
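In case the syntax is the question: environment flags like this go in front of the launch command. A minimal sketch, assuming a hypothetical `main.py` entry point (substitute whatever command you normally use to start each node):

```sh
# SUPPORT_BF16=0 should make the loader avoid bfloat16 weights
# (casting them to a supported dtype instead).
# "main.py" is a placeholder for your actual launch script.
SUPPORT_BF16=0 python3 main.py
```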
Thanks for the reply! I assume that `SUPPORT_BF16=0` means smaller quantized weights and lower accuracy, right?
I gave it a shot, but it doesn't work. Maybe the problem stems from the different chip architectures?
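If it helps narrow this down, here's a quick way to confirm which architecture each machine actually reports (these are standard macOS commands, nothing project-specific):

```sh
# Print the machine architecture: arm64 = Apple Silicon, x86_64 = Intel
uname -m
# Print the exact CPU model string
sysctl -n machdep.cpu.brand_string
```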
No no, this should definitely work.
Can you run with `DEBUG=6`? Are there any errors?
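A minimal sketch of combining both flags, again assuming the hypothetical `main.py` entry point from above:

```sh
# DEBUG=6 enables very verbose logging, so errors during weight loading
# or kernel compilation should show up in the output.
# "main.py" is a placeholder for your actual launch command.
DEBUG=6 SUPPORT_BF16=0 python3 main.py
```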
Thanks!!! The DEBUG flag helps a lot! But I'd like to know what the different DEBUG levels stand for? lol