starcoder
Home of StarCoder: fine-tuning & inference!
Looks like GPU usage almost doubles during saving (`save_pretrained`, specifically the `get_peft_model_state_dict` function). Is there a way to avoid this? Stack trace: ```Traceback (most recent call last): File "finetune_starcoder.py", line 343,...
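One workaround that is sometimes suggested (not an official fix) is to pull only the adapter tensors out of the PEFT-wrapped model and copy them to CPU before serializing, so the checkpointing step does not allocate extra memory on the GPU. A minimal sketch, assuming a PEFT-wrapped `model` and an output directory `output_dir` (both hypothetical names):

```
import os
import torch
from peft import get_peft_model_state_dict

def save_adapter_on_cpu(model, output_dir):
    """Save only the (small) LoRA adapter weights, moved to CPU first,
    so the save step does not duplicate tensors in GPU memory."""
    os.makedirs(output_dir, exist_ok=True)
    adapter_state = get_peft_model_state_dict(model)  # adapter tensors only
    cpu_state = {k: v.detach().cpu() for k, v in adapter_state.items()}
    torch.save(cpu_state, os.path.join(output_dir, "adapter_model.bin"))
```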
Hello, it is really exciting to see your work! May I know if the code for fine-tuning on other programming languages will be released in the near future? Up to...
When I was training with ```chat/train.py```, I got the following error after training: ``` Traceback (most recent call last): File "train.py", line 345, in main() File "train.py", line 313, in...
Using an A800 80GB, how long does it take to fine-tune? I am stuck...
Hi all, I've set up StarCoder as follows: ``` gen_checkpoint = "bigcode/starcoder" gen_device = "cuda" gen_tokenizer, gen_model = setup_model_tokenizer( gen_checkpoint, bit_4=False, device=gen_device, bnb_config=None ) ``` ``` def setup_model_tokenizer( path, device=None,...
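For context, the body of `setup_model_tokenizer` is the poster's own helper and is truncated above; a minimal sketch of what such a helper typically does with `transformers`, assuming the same parameter names:

```
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

def setup_model_tokenizer(path, device=None, bit_4=False, bnb_config=None):
    """Load tokenizer and model; optionally pass a bitsandbytes config for 4-bit loading."""
    tokenizer = AutoTokenizer.from_pretrained(path)
    model = AutoModelForCausalLM.from_pretrained(
        path,
        torch_dtype=torch.float16,
        quantization_config=bnb_config if bit_4 else None,
    )
    if device is not None and bnb_config is None:
        model = model.to(device)  # quantized models are placed by bitsandbytes/accelerate
    return tokenizer, model
```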
I'd just like you to know that code with permissive licensing that carries attribution requirements **is possibly unsuitable for training set inclusion.** I'm bringing this to your attention not as a...
Hi! Curious to know some more details about FIM and its effect on the pre-trained model. Here's a paragraph from the SantaCoder paper: > FIM for cheap We observe a...
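For anyone experimenting with FIM at inference time, the StarCoder tokenizer ships the FIM sentinel tokens `<fim_prefix>`, `<fim_suffix>`, and `<fim_middle>`; a minimal prefix-suffix-middle prompting sketch (generation settings are illustrative only):

```
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "bigcode/starcoder"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto")

# Prefix-suffix-middle (PSM) formatting: the model generates the missing middle.
prefix = "def print_hello():\n    "
suffix = "\n    return\n"
prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(inputs.input_ids, max_new_tokens=32)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:]))
```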
When I tried to load StarCoder following the provided tutorial, a RuntimeError emerged because of a CUDA error, even though torch.cuda.is_available() returns True. The GPU that I run this on is listed below,...
See title. As a temporary workaround, adding "peft==0.9.0" to requirements.txt and ignoring the readme.md instructions to install from git avoids this issue. However, it would be better if huggingface/bigcode could coordinate between...
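In concrete terms, the workaround described amounts to pinning the released package in requirements.txt instead of following the git install, e.g.:

```
# requirements.txt (temporary workaround)
peft==0.9.0
```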