
Hugging Face AutoModel compliant LLaMA model

Open PerpSeal opened this issue 1 year ago • 3 comments

Hey guys, it's a bit unclear what's expected here: m = GPT4AllGPU(LLAMA_PATH)

Where do I get that? What kind of file is it?

PerpSeal avatar Apr 08 '23 04:04 PerpSeal

Something from here: https://huggingface.co/models

Aunxfb avatar Apr 09 '23 03:04 Aunxfb

I don't have this working yet either, but to try to be more helpful, I think you specifically need something from here: https://huggingface.co/models?other=llama

edit: I made a little progress:

```sh
git lfs install
git clone <huggingface.co link>   # such as .../zpn/llama-7b
```

Then check config.json and tokenizer_config.json for improper capitalization (https://github.com/huggingface/transformers/issues/22222#issuecomment-1477171703): "Change the LLaMATokenizer in tokenizer_config.json into lowercase LlamaTokenizer and it works like a charm."
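
For reference, a minimal sketch of applying that fix in Python rather than by hand; the path is a placeholder for wherever you cloned the repo:

```python
import json
from pathlib import Path

# Adjust to the location of your cloned model repository.
config_path = Path("llama-7b-hf/tokenizer_config.json")
config = json.loads(config_path.read_text())

# Older conversions wrote "LLaMATokenizer"; transformers expects "LlamaTokenizer".
if config.get("tokenizer_class") == "LLaMATokenizer":
    config["tokenizer_class"] = "LlamaTokenizer"
    config_path.write_text(json.dumps(config, indent=2))
```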

Slowly-Grokking avatar Apr 10 '23 04:04 Slowly-Grokking

The thread https://github.com/nomic-ai/gpt4all/issues/159#issue-1650481558 suggests using the model https://huggingface.co/decapoda-research/llama-7b-hf

Thanks to @Slowly-Grokking, it worked for me. After setting the path in nomic/gpt4allGPU.py to the directory of the cloned Hugging Face repository, e.g. m = GPT4AllGPU('/home/user/gpt4all/llama-7b-hf') (make sure to enclose the path in quotation marks), and changing "tokenizer_class": "LLaMATokenizer" to "tokenizer_class": "LlamaTokenizer" in llama-7b-hf/tokenizer_config.json, it starts to work in a technical sense.
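
Putting it together, a sketch of the full usage, modeled on the GPU example in the repo's README at the time (the generation config values are illustrative, not required):

```python
from nomic.gpt4all import GPT4AllGPU

# Path to the cloned Hugging Face repository (quoted string, as noted above).
m = GPT4AllGPU('/home/user/gpt4all/llama-7b-hf')

config = {
    'num_beams': 2,
    'min_new_tokens': 10,
    'max_length': 100,
    'repetition_penalty': 2.0,
}
out = m.generate('write me a story about a lonely computer', config)
print(out)
```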

Note that using a LLaMA model from Hugging Face (which is Hugging Face AutoModel compliant and therefore GPU-acceleratable by gpt4all) means that you are no longer using the original assistant-style fine-tuned, quantized LoRA model.
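
As a quick sanity check (my own addition, not from gpt4all itself): if the checkpoint really is AutoModel compliant, transformers should load it directly, and the tokenizer load is exactly where the LLaMATokenizer capitalization bug shows up:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

path = '/home/user/gpt4all/llama-7b-hf'  # adjust to your clone

# Raises an error if tokenizer_config.json still says "LLaMATokenizer".
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path)
```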

keodarus avatar Apr 10 '23 08:04 keodarus

Stale, please open a new issue if this is still relevant.

niansa avatar Aug 11 '23 11:08 niansa