starcoder
Demo snippet pulls all checkpoints
Running tokenizer = AutoTokenizer.from_pretrained(checkpoint) as defined here - https://github.com/bigcode-project/starcoder#code-generation
pulls 7 checkpoint files, ~9 GB each. Is this the intended behavior?
Hi. Could you specify which checkpoint you are talking about? If you are referring to starcoder, loading the tokenizer should not download any checkpoint file. In fact, this code snippet
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoder")
```
runs in a couple of seconds. Are you talking about the line model = AutoModelForCausalLM.from_pretrained(checkpoint) instead? If that is the case, then yes, it should download about 7 "shards"; you can have a look at this.
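For reference, here is a minimal sketch (assuming huggingface_hub is installed and the shard files follow the usual pytorch_model-*.bin naming) that lists the checkpoint shards in the repo without downloading anything:

```python
from huggingface_hub import list_repo_files

# Query the Hub API for the file names in the repo; no weights are downloaded.
files = list_repo_files("bigcode/starcoder")

# Shard files are typically named pytorch_model-0000X-of-0000Y.bin.
shards = [f for f in files if f.startswith("pytorch_model-")]
print(len(shards), "shards:", shards)
```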
This is the example I am running:
#### Code Generation example
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder"
device = "cuda"  # for GPU usage or "cpu" for CPU usage

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# to save memory consider using fp16 or bf16 by specifying torch_dtype=torch.float16 for example
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

inputs = tokenizer.encode("def print_hello_world():", return_tensors="pt").to(device)
outputs = model.generate(inputs)
# clean_up_tokenization_spaces=False prevents a tokenizer edge case which can result in spaces being removed around punctuation
print(tokenizer.decode(outputs[0], clean_up_tokenization_spaces=False))
```
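If loading the full-precision weights is too heavy for your GPU, the torch_dtype hint from the comment in the snippet can be applied like this (a minimal sketch; bf16 via torch_dtype=torch.bfloat16 works the same way, and the download size stays the same, only the in-memory footprint shrinks):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder"
device = "cuda"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# Load the weights in half precision to roughly halve GPU memory usage.
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.float16).to(device)

inputs = tokenizer.encode("def print_hello_world():", return_tensors="pt").to(device)
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0], clean_up_tokenization_spaces=False))
```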