LitGPT chat terminates weirdly
Our `litgpt chat` command is doing something weird: it exits the chat session and executes the pasted code in the terminal.
To reproduce:
litgpt chat --checkpoint_dir out/custom-phi-2/final
and then paste the following into the chat window:
explain this code: # 1) Download a pretrained model
litgpt download --repo_id microsoft/phi-2
# 2) Pretrain the model
litgpt pretrain \
--initial_checkpoint_dir checkpoints/microsoft/phi-2 \
--data Alpaca2k \
--out_dir out/custom-phi-2
# 3) Chat with the model
litgpt chat \
--checkpoint_dir out/custom-phi-2/final
This will first work correctly, but then it exits the session and runs the code in the shell. For example, search for `>> Prompt: >> Reply` in the output below:
⚡ codegemma ~/litgpt litgpt chat --checkpoint_dir checkpoints/google/codegemma-7b-it
Now chatting with CodeGemma-7b-it.
To exit, press 'Enter' on an empty prompt.
Seed set to 1234
>> Prompt: explain this code: # 1) Download a pretrained model
litgpt download --repo_id microsoft/phi-2
# 2) Pretrain the model
litgpt pretrain \
--initial_checkpoint_dir checkpoints/microsoft/phi-2 \
--data Alpaca2k \
--out_dir out/custom-phi-2
# 3) Chat with the model
litgpt chat \
--checkpoint_dir out/custom-phi-2/final
>> Reply: The code snippet you provided attempts to download a pre-trained model from a given URL. Here's a step-by-step explanation:
1. **Import the necessary library:** The code starts by importing the `requests` library, which is needed to make HTTP requests to the URL.
2. **Define the URL:** The URL where the pre-trained model is located is defined as a string.
3. **Download the model:** The `requests.get()` method is used to send a GET request to the URL. This request retrieves the binary data of the pre-trained model.
4. **Save the model:** The `with open()` statement is used to open a file named `model.ckpt` in write binary mode (`wb`). The retrieved model data is then written to this file using the file object's `write()` method.
5. **Close the file:** The file object is closed using the `close()` method to save the changes made to the file.
In summary, this code downloads a pre-trained model from the specified URL and saves it locally as a file named `model.ckpt`. This can be useful for using the model in further applications or training additional models.
Time for inference: 13.50 sec total, 18.52 tokens/sec, 250 tokens
>> Prompt: >> Reply: I can't download the LITGPT model because it is not publicly available. It is a private repository on the Hugging Face model hub. For security reasons, I can't access private repositories without the owner's permission.
Time for inference: 2.47 sec total, 19.00 tokens/sec, 47 tokens
>> Prompt: %
⚡ codegemma ~/litgpt # 2) Pretrain the model
⚡ codegemma ~/litgpt litgpt pretrain \
> --initial_checkpoint_dir checkpoints/microsoft/phi-2 \
> --data Alpaca2k \
> --out_dir out/custom-phi-2
Traceback (most recent call last):
File "/home/zeus/miniconda3/envs/cloudspace/bin/litgpt", line 8, in <module>
sys.exit(main())
File "/teamspace/studios/this_studio/litgpt/litgpt/__main__.py", line 131, in main
fn(**kwargs)
File "/teamspace/studios/this_studio/litgpt/litgpt/pretrain.py", line 96, in setup
raise ValueError(f"Please specify --model_name <model_name>. Available values:\n{available_models}")
ValueError: Please specify --model_name <model_name>. Available values:
Camel-Platypus2-13B
Camel-Platypus2-70B
CodeGemma-7b-it
CodeLlama-13b-Instruct-hf
CodeLlama-13b-Python-hf
CodeLlama-13b-hf
CodeLlama-34b-Instruct-hf
CodeLlama-34b-Python-hf
CodeLlama-34b-hf
CodeLlama-70b-Instruct-hf
CodeLlama-70b-Python-hf
CodeLlama-70b-hf
CodeLlama-7b-Instruct-hf
CodeLlama-7b-Python-hf
CodeLlama-7b-hf
FreeWilly2
Gemma-2b
Gemma-2b-it
Gemma-7b
Gemma-7b-it
LLaMA-2-7B-32K
Llama-2-13b-chat-hf
Llama-2-13b-hf
Llama-2-70b-chat-hf
Llama-2-70b-hf
Llama-2-7b-chat-hf
Llama-2-7b-chat-hf-function-calling-v2
Llama-2-7b-hf
Mistral-7B-Instruct-v0.1
Mistral-7B-Instruct-v0.2
Mistral-7B-v0.1
Mistral-7B-v0.2
Mixtral-8x7B-Instruct-v0.1
Mixtral-8x7B-v0.1
Nous-Hermes-13b
Nous-Hermes-Llama2-13b
Nous-Hermes-llama-2-7b
Platypus-30B
Platypus2-13B
Platypus2-70B
Platypus2-70B-instruct
Platypus2-7B
RedPajama-INCITE-7B-Base
RedPajama-INCITE-7B-Chat
RedPajama-INCITE-7B-Instruct
RedPajama-INCITE-Base-3B-v1
RedPajama-INCITE-Base-7B-v0.1
RedPajama-INCITE-Chat-3B-v1
RedPajama-INCITE-Chat-7B-v0.1
RedPajama-INCITE-Instruct-3B-v1
RedPajama-INCITE-Instruct-7B-v0.1
Stable-Platypus2-13B
dolly-v2-12b
dolly-v2-3b
dolly-v2-7b
falcon-180B
falcon-180B-chat
falcon-40b
falcon-40b-instruct
falcon-7b
falcon-7b-instruct
longchat-13b-16k
longchat-7b-16k
open_llama_13b
open_llama_3b
open_llama_7b
phi-1_5
phi-2
pythia-1.4b
pythia-1.4b-deduped
pythia-12b
pythia-12b-deduped
pythia-14m
pythia-160m
pythia-160m-deduped
pythia-1b
pythia-1b-deduped
pythia-2.8b
pythia-2.8b-deduped
pythia-31m
pythia-410m
pythia-410m-deduped
pythia-6.9b
pythia-6.9b-deduped
pythia-70m
pythia-70m-deduped
stable-code-3b
stablecode-completion-alpha-3b
stablecode-completion-alpha-3b-4k
stablecode-instruct-alpha-3b
stablelm-3b-4e1t
stablelm-base-alpha-3b
stablelm-base-alpha-7b
stablelm-tuned-alpha-3b
stablelm-tuned-alpha-7b
stablelm-zephyr-3b
tiny-llama-1.1b
tiny-llama-1.1b-chat
vicuna-13b-v1.3
vicuna-13b-v1.5
vicuna-13b-v1.5-16k
vicuna-33b-v1.3
vicuna-7b-v1.3
vicuna-7b-v1.5
vicuna-7b-v1.5-16k
⚡ codegemma ~/litgpt
⚡ codegemma ~/litgpt # 3) Chat with the model
⚡ codegemma ~/litgpt litgpt chat \
> --checkpoint_dir out/custom-phi-2/final
--checkpoint_dir '/teamspace/studios/this_studio/litgpt/out/custom-phi-2/final' is not a checkpoint directory.
Find download instructions at https://github.com/Lightning-AI/litgpt/blob/main/tutorials
You have downloaded locally:
--checkpoint_dir '/teamspace/studios/this_studio/litgpt/checkpoints/google/codegemma-7b-it'
See all download options by running:
I recommend validating it on a local machine. The terminal in a Studio might behave weirdly.
Good point, this is probably it. I'll test it on a GPU machine later.
Unfortunately, I have that issue in a local terminal too.
The script is designed to stop when you pass an empty line: https://github.com/Lightning-AI/litgpt/blob/main/litgpt/chat/base.py#L173-L174. I suggest debugging what `input()` receives when you copy-paste a piece of text like that containing empty newlines. It might split it into pieces.
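To make the failure mode concrete, here is a minimal sketch (a simplified stand-in, not the actual litgpt code) of a chat loop built on `input()`. Pasting a multi-line snippet feeds each pasted line to a separate `input()` call, the first blank line ends the loop, and anything still buffered in the terminal is handed to the shell afterwards:

```python
# Minimal sketch of the failure mode -- not the real implementation.
while True:
    prompt = input(">> Prompt: ")
    if prompt == "":
        # An empty line terminates the session. When a multi-line paste
        # contains a blank line, the loop exits here and the remaining
        # pasted lines end up being executed by the shell.
        break
    print(f">> Reply: echoing {prompt!r}")
```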
Hm, yeah, this is indeed what's happening. There's currently no way around it if we want to use `input()`. Even if we fixed that by, e.g., not quitting on empty lines, the problem would still be that the model would process each line independently, I think. We probably need to switch to `sys.stdin.read()` or so.
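For illustration, a rough sketch of what the `sys.stdin.read()` direction could look like (hypothetical, not a proposed patch). It keeps pasted newlines inside a single prompt, but it blocks until EOF, so the session ends after one prompt, which is one of the trade-offs to work out:

```python
import sys

# Hypothetical sketch: read the entire prompt in one call so pasted
# newlines stay inside one prompt instead of hitting separate input()
# calls. sys.stdin.read() blocks until EOF (Ctrl-D on Unix), which also
# ends the session after a single prompt.
print(">> Prompt (finish with Ctrl-D): ", end="", flush=True)
prompt = sys.stdin.read()
print(f">> Reply: echoing {prompt!r}")
```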
Maybe the easiest fix is simply to remove the functionality to terminate on an empty line. This would also prevent accidents where you copy-paste a `litgpt chat` command plus a trailing newline, which would lead to immediate termination. Asking the user to type `exit`, just like in the Python shell, is IMO very acceptable.
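A sketch of what that could look like (illustrative only, not the actual patch): blank lines are ignored and the session ends only on an explicit `exit`. Note that this alone still sends each pasted line as a separate prompt, per the comment above:

```python
# Illustrative sketch of the proposal: terminate on an explicit keyword
# instead of an empty line.
while True:
    prompt = input(">> Prompt: ")
    if prompt.strip().lower() == "exit":
        break
    if not prompt.strip():
        continue  # ignore blank lines instead of quitting
    print(f">> Reply: echoing {prompt!r}")
```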
I agree. That's what I had first, but then I thought you guys wouldn't be happy to remove it entirely... but I agree, it's the simplest solution.
Is that enough? I thought this would also need a way to let it know that you finished typing, to support newlines.
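One way to signal "done typing" without relying on EOF would be a sentinel line; here is a hypothetical sketch (the sentinel string and function name are made up for illustration, not part of litgpt):

```python
import sys

# Hypothetical sentinel-based prompt reader: collect lines until the
# sentinel appears on its own line, so blank lines and pasted snippets
# stay part of one prompt.
def read_prompt(sentinel: str = "/send") -> str:
    print(f">> Prompt (end with '{sentinel}' on its own line):")
    lines = []
    for line in sys.stdin:
        if line.strip() == sentinel:
            break
        lines.append(line)
    return "".join(lines)
```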