
chat/base.py: extend checkpoint_dir before accessing it

Open Andrei-Aksionov opened this issue 1 year ago • 7 comments

Hi there 👋

Inside auto_download_checkpoint the code also extends the path by prepending checkpoints/ if it's not provided.

Since this extension currently happens after check_file_size_on_cpu_and_warn, the command

litgpt chat google/gemma-2-9b-it

will fail.

Andrei-Aksionov avatar Jul 12 '24 10:07 Andrei-Aksionov

The reason the auto-download happens so far down (compared to the other places) is that I had to move it below the LoRA merging; otherwise it would download the base model even when you only want to use local LoRA weights.

rasbt avatar Jul 12 '24 15:07 rasbt

I also didn't notice the missing CPU warning because I was running it on GPU 😅. The following reorg might work ...

rasbt avatar Jul 12 '24 16:07 rasbt

I think this should be good now. Feel free to merge if you agree

rasbt avatar Jul 12 '24 18:07 rasbt

There is a problem: checkpoint_path is created before auto_download_checkpoint is called, which means checkpoint_dir hadn't been extended yet at the moment checkpoint_path was built. So load_checkpoint(fabric, model, checkpoint_path) will throw an error if the path doesn't contain checkpoints/. This needs to be fixed, and the test should be updated.
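The fix amounts to deriving checkpoint_path only after auto_download_checkpoint has returned the (possibly extended) directory, rather than baking in the stale directory first. A simplified sketch, assuming the lit_model.pth filename and a hypothetical helper signature:

```python
from pathlib import Path


def resolve_paths(checkpoint_dir: Path, auto_download_checkpoint):
    """Hypothetical sketch of the corrected ordering.

    Resolve the checkpoint directory first (which may prepend
    checkpoints/), then derive the file path from the final directory,
    so load_checkpoint never sees a stale, un-extended path.
    """
    checkpoint_dir = auto_download_checkpoint(checkpoint_dir)
    checkpoint_path = checkpoint_dir / "lit_model.pth"
    return checkpoint_dir, checkpoint_path
```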

Andrei-Aksionov avatar Jul 12 '24 19:07 Andrei-Aksionov

Are we good to merge? What do you think?

Andrei-Aksionov avatar Jul 13 '24 13:07 Andrei-Aksionov

Yes, I think so. But one last question: after the update, have you double-checked / tested it on custom paths that don't start with `"checkpoints"`, like

litgpt chat my_custom_dir/google/gemma-2-9b-it

rasbt avatar Jul 13 '24 13:07 rasbt

> have you double-checked / tested it on custom paths that don't start with `"checkpoints"`

No 😊. I've only just now checked it with a Pythia model:

fix_chat_checkpoint_dir_extension ~/lit-gpt litgpt chat custom_dir/$repo_id
{'access_token': None,
 'checkpoint_dir': PosixPath('custom_dir/EleutherAI/pythia-70m'),
 'compile': False,
 'multiline': False,
 'precision': None,
 'quantize': None,
 'temperature': 0.8,
 'top_k': 200,
 'top_p': 1.0}
Now chatting with pythia-70m.
To exit, press 'Enter' on an empty prompt.

Seed set to 1234
>> Prompt: Hello
>> Reply: *
Time for inference: 0.28 sec total, 3.52 tokens/sec, 1 tokens

Andrei-Aksionov avatar Jul 13 '24 13:07 Andrei-Aksionov

Nice, thanks for checking! Looks all good to me now :)

rasbt avatar Jul 13 '24 14:07 rasbt