
Running Dolly (predict/inference) on a Mac / CPU

nardi-yuval opened this issue 2 years ago · 1 comment

Hi,

  1. Why is running Dolly on a Mac EXTREMELY slow? Isn't it conceptually just calling a "predict" function? Do you need a GPU for reasonable runtime?
  2. What would be the best way for someone who wants to build a LangChain chain that uses Dolly, but also wants to be able to debug the code?

Thanks, Yuval

nardi-yuval · Apr 28 '23 19:04

These models are far too large to run reasonably on CPUs, yes; you need an NVIDIA GPU, so this won't work on Macs. See https://github.com/databrickslabs/dolly/issues/67 for some attempts to run it via MPS (Apple's Metal backend), but it doesn't sound like that works.
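The device trade-off described above can be sketched as a small helper. This is a hypothetical function for illustration, not code from the Dolly repo; the preference order (CUDA, then MPS, then CPU) reflects the answer's caveats:

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Choose an inference device for a large model like Dolly.

    Prefers NVIDIA CUDA. MPS is reported not to work reliably for
    Dolly (see issue #67), and CPU inference on a multi-billion-
    parameter model is extremely slow, so both are fallbacks only.
    """
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"  # experimental; issue #67 suggests this may fail
    return "cpu"      # works, but expect very long inference times
```

In practice the two flags would come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.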

Dolly is not code; it's a model. You can use LangChain with it, and there are examples in this repo. You can debug your own Python code as usual, but that isn't related to the model itself. You can, however, set verbose=True on LangChain chains to have them show you exactly what they are doing with the LLM, which often helps 'debug' in that sense.
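To illustrate the idea behind verbose=True: a chain with verbosity enabled prints the prompt it sends to the LLM and the raw text it gets back. This is a hypothetical stand-alone wrapper showing that concept, not LangChain's actual implementation:

```python
from typing import Callable

def make_verbose(llm_call: Callable[[str], str],
                 verbose: bool = True) -> Callable[[str], str]:
    """Wrap an LLM call so each invocation logs its input and output,
    mimicking what verbose=True does for a LangChain chain."""
    def wrapped(prompt: str) -> str:
        if verbose:
            print(f"> Prompt sent to LLM:\n{prompt}")
        result = llm_call(prompt)
        if verbose:
            print(f"> Raw LLM output:\n{result}")
        return result
    return wrapped

# Usage with a dummy LLM standing in for Dolly:
echo_llm = make_verbose(lambda p: p.upper())
```

Seeing the fully rendered prompt this way is usually the fastest path to diagnosing why a chain produces unexpected output.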

srowen · Apr 29 '23 14:04