starcoder
Home of StarCoder: fine-tuning & inference!
I am trying to recreate this, but when I use the masking id -100 it raises a "device-side assert triggered" error.
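For context, -100 is the default `ignore_index` of `torch.nn.CrossEntropyLoss`, which `transformers` uses for the causal-LM loss, so it must only ever appear in `labels`, never in `input_ids`; an out-of-range id in the embedding or loss kernels is what usually fires the device-side assert. A minimal sketch of the intended masking, with an illustrative checkpoint and a hypothetical `prompt_len`:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "bigcode/starcoderbase-1b"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

text = "SELECT * FROM users WHERE id = 1"
enc = tokenizer(text, return_tensors="pt")

# Labels start as a copy of the input ids; only the labels get -100.
labels = enc["input_ids"].clone()

# Mask the first `prompt_len` tokens so they are ignored by the loss.
# Putting -100 into input_ids instead would index outside the vocabulary
# and trigger the device-side assert.
prompt_len = 4  # hypothetical prompt length
labels[:, :prompt_len] = -100

out = model(**enc, labels=labels)
print(out.loss)
```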
Hi, while running on a Colab A100 instance I noticed that the VRAM consumed by finetune.py was only about 5 GB for starcoderbase-1b, so I attempted it on my local...
When running StarCoder full-parameter tuning on multiple GPUs:
```
File "starcoder-git/finetune.py", line 44, in on_save
    kwargs["model"].save_pretrained(checkpoint_folder)
  File "/miniconda3/envs/sqlcode/lib/python3.9/site-packages/transformers/modeling_utils.py", line 2480, in save_pretrained
    os.remove(full_filename)
FileNotFoundError: [Errno 2] No such file...
```
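One common cause is that every rank calls `save_pretrained` on the same folder and they race on the `os.remove` cleanup inside it. A hedged sketch of guarding the save so only the main process writes; the callback name is illustrative, not the exact one in finetune.py:

```python
import os
from transformers import TrainerCallback

class SaveModelCallback(TrainerCallback):
    def on_save(self, args, state, control, **kwargs):
        # state.is_world_process_zero is True only on rank 0,
        # so the checkpoint folder is written by a single process.
        if state.is_world_process_zero:
            checkpoint_folder = os.path.join(
                args.output_dir, f"checkpoint-{state.global_step}"
            )
            kwargs["model"].save_pretrained(checkpoint_folder)
        return control
```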
I am new to StarCoder. When I run the following demo:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "./starcoder2-3b"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto", torch_dtype=torch.bfloat16)
...
```
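The demo above is cut off, so for reference here is a minimal sketch of how such a script is usually completed, assuming the local `./starcoder2-3b` checkpoint loads correctly; the prompt text is just an example:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "./starcoder2-3b"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, device_map="auto", torch_dtype=torch.bfloat16
)

# Tokenize a code prompt, move it to the model's device, and generate.
inputs = tokenizer("def fibonacci(n):", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```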
The [paper](https://arxiv.org/html/2305.06161v2) (Section E.3) shows that we can put prompt prefixes with the `` token. My question is: how do we handle this when the prompt we are...
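The specific prefix token did not survive rendering above, so as a hedged illustration here is how metadata-style prefixes can be prepended using special tokens that do exist in the StarCoder tokenizer (`<reponame>`, `<filename>`); whether these are the exact tokens the question refers to is an assumption, and the repo/file names are placeholders:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoder")

# Prepend repository and file metadata before the actual code prompt.
prompt = (
    "<reponame>my-org/my-repo"      # hypothetical repository name
    "<filename>utils/helpers.py\n"  # hypothetical file path
    "def mean(values):"
)
input_ids = tokenizer(prompt, return_tensors="pt").input_ids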
Hello, I wish to reproduce the StarChat training for educational purposes, but I see the dataset (HuggingFaceH4/oasst1_en) has been removed. Is there any way to download it? If not, any...
FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
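The swap is one-for-one; `token` takes the same Hugging Face access token string that `use_auth_token` used to (the checkpoint name and placeholder token below are examples):

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "bigcode/starcoder",
    token="hf_...",  # formerly use_auth_token="hf_..."
)
```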
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 288.00 MiB. GPU 0 has a total capacty of 21.99 GiB of which 59.00 MiB is free. Process 42083 has 21.92 GiB...
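A minimal sketch of two common mitigations for this kind of OOM on a ~22 GiB GPU, loading in bfloat16 or quantizing to 8-bit; the checkpoint name is an example and the 8-bit path assumes bitsandbytes is installed:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

checkpoint = "bigcode/starcoderbase"  # example checkpoint

# Option 1: half precision, spread across available devices.
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, torch_dtype=torch.bfloat16, device_map="auto"
)

# Option 2: 8-bit quantization to roughly halve the weight memory again.
model_8bit = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
```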
As titled, I have thousands of SQL files, and I wish to fine-tune the base model on them for the FIM (fill-in-the-middle) task.
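A hedged sketch of turning one SQL file into a FIM training sample using the StarCoder FIM special tokens (`<fim_prefix>`, `<fim_suffix>`, `<fim_middle>`) in prefix-suffix-middle order; the random split and the SQL text are illustrative only:

```python
import random

def make_fim_sample(sql_text: str) -> str:
    """Split the file into prefix/middle/suffix and arrange it in PSM order."""
    a, b = sorted(random.sample(range(len(sql_text)), 2))
    prefix, middle, suffix = sql_text[:a], sql_text[a:b], sql_text[b:]
    return (
        "<fim_prefix>" + prefix
        + "<fim_suffix>" + suffix
        + "<fim_middle>" + middle
    )

print(make_fim_sample("SELECT name, age FROM users WHERE age > 21 ORDER BY age;"))
```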
I'm new to this area of language models. In my use case I want to fine-tune the SQLCoder model with the Spider dataset using this code base, as this repo...