Running Dolly (predict/inference) on a Mac / CPU
Hi,
- Why is using Dolly on a Mac EXTREMELY slow? Isn't it conceptually just calling a "predict" function? Do you need a GPU for a reasonable runtime?
- What would be the best way for someone who wants to develop a langchain chain that uses Dolly, but also wants to be able to debug the code?
Thanks, Yuval
These models are far too large to run reasonably on CPUs, yes. You need an NVIDIA GPU, so this won't work on Macs. See https://github.com/databrickslabs/dolly/issues/67 for some attempts to get it running via MPS, but it doesn't sound like that works.
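For reference, here is a minimal sketch of loading Dolly for inference on an NVIDIA GPU with `transformers`, along the lines of this repo's README. The model name `databricks/dolly-v2-3b` and the generation call are taken from the Hugging Face model card; adjust for your setup:

```python
import torch
from transformers import pipeline

# Assumes an NVIDIA GPU with enough VRAM; device_map="auto" places the
# weights on the GPU automatically, and bfloat16 halves memory vs. float32.
generate_text = pipeline(
    model="databricks/dolly-v2-3b",  # smallest variant; 7b/12b need more VRAM
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)

res = generate_text("Explain why large language models need a GPU for inference.")
print(res[0]["generated_text"])
```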
Dolly is not code, it's a model. You can use langchain with it; there are examples in this repo. You can debug your own Python code as usual, but that just isn't related to this model. You can, however, set verbose=True on langchain chains to get them to show you what they are doing with the LLM, which often helps 'debug' in that sense.
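As a minimal sketch of that, assuming a 2023-era langchain install and the `generate_text` pipeline from the snippet above (`HuggingFacePipeline`, `PromptTemplate`, and `LLMChain` are the standard langchain classes for this; names may differ in newer versions):

```python
from langchain.llms import HuggingFacePipeline
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# Wrap the local Dolly transformers pipeline so langchain can call it.
llm = HuggingFacePipeline(pipeline=generate_text)

prompt = PromptTemplate(
    input_variables=["question"],
    template="{question}",
)

# verbose=True prints the exact prompt the chain sends to the LLM,
# which is usually what you want when 'debugging' a chain.
chain = LLMChain(llm=llm, prompt=prompt, verbose=True)
print(chain.run(question="What is Dolly?"))
```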