llama
How to run 30B on 4 GPUs interactively
It works with predefined prompts. How can I change it to a chat mode like ChatGPT? I use:
if local_rank == 0:
prompts = input("User:")
It doesn't work.
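The likely reason this fails is that only rank 0 reads from stdin, so the other ranks never see the prompt and the collective calls inside generation fall out of sync. A minimal sketch of the usual fix, assuming `torch.distributed` is already initialized (the llama example scripts do this during setup): read the prompt on rank 0, then broadcast it to every rank. The function name `get_user_prompt` is illustrative, not from the repo.

```python
import torch.distributed as dist

def get_user_prompt(local_rank: int) -> str:
    """Read a prompt on rank 0 and broadcast it to all ranks.

    Assumes the process group is already initialized, as the
    llama example scripts do before generation. Illustrative
    sketch, not code from the repository.
    """
    # Only rank 0 talks to the terminal; other ranks pass a placeholder.
    obj = [input("User: ")] if local_rank == 0 else [None]
    # broadcast_object_list replaces the placeholder with rank 0's string.
    dist.broadcast_object_list(obj, src=0)
    return obj[0]
```

Every rank then passes the same returned prompt into `generator.generate(...)`, so the model-parallel collectives stay in lockstep.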
I made a fork that supports interactive jobs across multiple machines: https://github.com/LambdaLabsML/llama/blob/b8cb25d01d0563bba12f265649079061f3ed753e/interactive.py#L122-L137
Closing as we released Llama 2 chat. Feel free to re-open as needed.