llm.c
How to do inference on the trained weights of a GPT-2 model after finishing training on CPU using train_gpt2.py and train_gpt2?
Hi, thank you very much for making everything so understandable, even for a noob like me. Sorry for such a silly question. I followed the instructions in the repository's README to train a language model on a Bengali text dataset of around 35,169 tokens, using my laptop's CPU (no GPU). I modified the train_gpt2.py script to set my own starting words in Bengali instead of the default "<|endoftext|>". Now I want to know how to check whether the trained model weights (the result of the training process) have improved the model's capabilities compared to before training. I would like to compare the model's predictions with the actual text in the test dataset. Thank you for your time.
There is a script at /llm.c/dev/eval/export_hf.py which converts the checkpoint to safetensors. For example:

```
python /home/myles/llm.c/dev/eval/export_hf.py --input /home/myles/llm.c/log124M/model_00019560.bin --output converted_model
```
And then you can do something like this in Python:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

def generate_text(prompt, max_length=1000):
    # Load the converted checkpoint; use bfloat16 on GPU, float32 on CPU
    tokenizer = AutoTokenizer.from_pretrained("/home/myles/llm.c/converted_model")
    model = AutoModelForCausalLM.from_pretrained(
        "/home/myles/llm.c/converted_model",
        torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
    )
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    outputs = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_length,
        do_sample=True,   # sample instead of greedy decoding
        top_p=0.95,
        top_k=50,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    print("Chatbot: Hi! How can I help you today?")
    while True:
        user_input = input("You: ")
        if user_input.lower() in ["exit", "quit", "stop"]:
            print("Chatbot: Goodbye!")
            break
        prompt = f"User: {user_input}\nChatbot:"
        response = generate_text(prompt)
        print(f"Chatbot: {response}")
```
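Since you asked about comparing predictions against your test set: rather than just eyeballing generations, you can measure the loss (and perplexity) of the converted model on held-out Bengali text and compare it with the same number from a different checkpoint. Here is a minimal sketch, assuming the converted_model directory from the export step above and a hypothetical plain-text test file test.txt (not part of the repo):

```python
import math
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_dir = "/home/myles/llm.c/converted_model"  # path from the export step above
test_file = "test.txt"  # hypothetical: your held-out Bengali test text

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir)
model.eval()

text = open(test_file, encoding="utf-8").read()
input_ids = tokenizer(text, return_tensors="pt")["input_ids"]

# Score the text in fixed-size windows so each chunk fits GPT-2's 1024-token context.
stride = 1024
nll_sum, token_count = 0.0, 0
with torch.no_grad():
    for start in range(0, input_ids.size(1), stride):
        chunk = input_ids[:, start : start + stride]
        if chunk.size(1) < 2:
            break  # nothing left to predict in this chunk
        # Passing labels=chunk makes the model return the mean
        # cross-entropy over the chunk's (len - 1) predicted tokens.
        loss = model(chunk, labels=chunk).loss
        nll_sum += loss.item() * (chunk.size(1) - 1)
        token_count += chunk.size(1) - 1

avg_nll = nll_sum / token_count
print(f"avg loss: {avg_nll:.4f}, perplexity: {math.exp(avg_nll):.2f}")
```

Run the same script with model_dir pointed at a baseline (for instance the Hugging Face "gpt2" checkpoint, or an earlier model_*.bin you exported the same way). A lower loss/perplexity on the same test file after training means the weights really did improve on your data. Note this non-overlapping-window scoring slightly overstates perplexity at chunk boundaries, but it is consistent across checkpoints, which is all a comparison needs.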