
Demo snippet pulls all checkpoints

Open · dhingratul opened this issue 2 years ago · 2 comments

tokenizer = AutoTokenizer.from_pretrained(checkpoint), as defined here: https://github.com/bigcode-project/starcoder#code-generation, pulls 7 checkpoint files of ~9 GB each. Is this the intended behavior?

dhingratul · Aug 07 '23 23:08

Hi. Could you specify which checkpoint you are talking about? If you are referring to starcoder, loading the tokenizer should not download any checkpoint file. In fact, this code snippet

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoder")

takes only a couple of seconds to run. Are you talking about the line model = AutoModelForCausalLM.from_pretrained(checkpoint) instead? If that is the case, then yes, it should download about 7 "shards"; you can have a look at this.

ArmelRandy · Aug 09 '23 08:08
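
For reference, here is a minimal sketch (an editorial illustration, not part of the original reply) that lists the files hosted in the model repository, so you can see the sharded weight files that from_pretrained downloads. It assumes the huggingface_hub package is installed and that you have access to the bigcode/starcoder repo:

from huggingface_hub import list_repo_files

# List every file hosted in the model repository.
files = list_repo_files("bigcode/starcoder")

# The tokenizer only needs the small tokenizer/config files;
# the large sharded weight files are what AutoModelForCausalLM fetches.
shards = [f for f in files if "pytorch_model" in f and f.endswith(".bin")]
print(shards)  # e.g. pytorch_model-00001-of-00007.bin, ...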

This is the example I am running:

#### Code Generation example
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder"
device = "cuda" # for GPU usage or "cpu" for CPU usage

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# to save memory, consider fp16 or bf16 by specifying e.g. torch_dtype=torch.float16 (see the sketch at the end of the thread)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

inputs = tokenizer.encode("def print_hello_world():", return_tensors="pt").to(device)
outputs = model.generate(inputs)
# clean_up_tokenization_spaces=False prevents a tokenizer edge case which can result in spaces being removed around punctuation
print(tokenizer.decode(outputs[0], clean_up_tokenization_spaces=False))

dhingratul · Aug 09 '23 16:08
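
As a side note on the memory comment in the snippet above, here is a minimal sketch of the reduced-precision variant it suggests (an editorial illustration, not part of the original comment; assumes torch is installed and a CUDA GPU is available):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder"
device = "cuda"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# torch_dtype=torch.float16 loads the weights in half precision,
# roughly halving the memory footprint compared to the default fp32.
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, torch_dtype=torch.float16
).to(device)

Note that this only reduces memory at load time; the same shard files are still downloaded in full.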