llm.c
How to do inference on the trained weights of a GPT-2 model after finishing training on CPU using train_gpt2.py and train_gpt2?
Hi, thank you very much for making everything so understandable, even for a noob like me. Sorry for such a silly question. I followed the instructions in the repository's README to train a language model on a Bengali text dataset of around 35,169 tokens, using my laptop's CPU (no GPU). I modified the train_gpt2.py script to set my own starting words in Bengali instead of the default "<|endoftext|>". Now I want to know how to check whether the trained model weights (the result of the training process) have improved the model's capabilities compared to before training. I would like to compare the model's predictions with the actual text in the test dataset. Thank you for your time.
There is a script at /llm.c/dev/eval/export_hf.py which converts the checkpoint to safetensors. For example:

```
python /home/myles/llm.c/dev/eval/export_hf.py --input /home/myles/llm.c/log124M/model_00019560.bin --output converted_model
```
And then you can do something like this in Python:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

def generate_text(prompt, max_length=1000):
    # Load the converted checkpoint; use bfloat16 on GPU, float32 on CPU
    tokenizer = AutoTokenizer.from_pretrained("/home/myles/llm.c/converted_model")
    model = AutoModelForCausalLM.from_pretrained(
        "/home/myles/llm.c/converted_model",
        torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
    )
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    outputs = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_length,
        do_sample=True,   # sample instead of greedy decoding
        top_p=0.95,
        top_k=50,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    print("Chatbot: Hi! How can I help you today?")
    while True:
        user_input = input("You: ")
        if user_input.lower() in ["exit", "quit", "stop"]:
            print("Chatbot: Goodbye!")
            break
        prompt = f"User: {user_input}\nChatbot:"
        response = generate_text(prompt)
        print(f"Chatbot: {response}")
```
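Since you asked about comparing predictions against your test set: rather than just eyeballing generations, you can measure the loss (and perplexity) of the converted model on held-out Bengali text and compare it with the same number from a different checkpoint. Here is a minimal sketch, assuming the converted_model directory from the export step above and a hypothetical plain-text test file test.txt (not part of the repo):

```python
import math
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_dir = "/home/myles/llm.c/converted_model"  # path from the export step above
test_file = "test.txt"  # hypothetical: your held-out Bengali test text

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir)
model.eval()

text = open(test_file, encoding="utf-8").read()
input_ids = tokenizer(text, return_tensors="pt")["input_ids"]

# Score the text in fixed-size windows so each chunk fits GPT-2's 1024-token context.
stride = 1024
nll_sum, token_count = 0.0, 0
with torch.no_grad():
    for start in range(0, input_ids.size(1), stride):
        chunk = input_ids[:, start : start + stride]
        if chunk.size(1) < 2:
            break  # nothing left to predict in this chunk
        # Passing labels=chunk makes the model return the mean
        # cross-entropy over the chunk's (len - 1) predicted tokens.
        loss = model(chunk, labels=chunk).loss
        nll_sum += loss.item() * (chunk.size(1) - 1)
        token_count += chunk.size(1) - 1

avg_nll = nll_sum / token_count
print(f"avg loss: {avg_nll:.4f}, perplexity: {math.exp(avg_nll):.2f}")
```

Run the same script with model_dir pointed at a baseline (for instance the Hugging Face "gpt2" checkpoint, or an earlier model_*.bin you exported the same way). A lower loss/perplexity on the same test file after training means the weights really did improve on your data. Note this non-overlapping-window scoring slightly overstates perplexity at chunk boundaries, but it is consistent across checkpoints, which is all a comparison needs.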