starcoder
Home of StarCoder: fine-tuning & inference!
Hi friends, I was trying to test the finetune/finetune.py script. It seems that state.best_model_checkpoint always returns None, leading to a failure at the end of the program. Is it that...
Hello, after loading the model I asked it what it is able to generate, and it responded with a question mark. Then I asked what project we were working on...
I am exploring the possibility of using StarCoder to generate embeddings for code tokens and would like to know if this is feasible with the current implementation. ### Questions: 1....
When aiming to fine-tune starcoder or octocoder on a custom dataset for integration with an IDE, would it be more appropriate to process the data in a question & answer...
To fix the `unrecognized arguments` problem when running finetune.py via `torch.distributed.launch`, the argument `local_rank` needs to be changed to `local-rank`. Launch command: ```shell python -m torch.distributed.launch --nproc_per_node=2 finetune.py --model_path xxx...
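The rename above reflects a change in how the launcher spells the flag: newer versions of `torch.distributed.launch` pass `--local-rank` (hyphenated), while older training scripts declare `--local_rank` (underscore). A minimal sketch of a parser that accepts both spellings, assuming an argparse-based script like finetune.py (the flag wiring here is illustrative, not the actual finetune.py code):

```python
import argparse

# Accept both spellings of the local rank flag so the script works with
# old and new versions of torch.distributed.launch alike. Both option
# strings map onto the same destination attribute, args.local_rank.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--local_rank", "--local-rank",
    type=int, default=0, dest="local_rank",
    help="rank of the process on this node (injected by the launcher)",
)

# parse_known_args ignores any other launcher-injected flags instead of
# failing with "unrecognized arguments".
args, _unknown = parser.parse_known_args(["--local-rank", "2"])
print(args.local_rank)  # → 2
```

Using `parse_known_args` rather than `parse_args` is a common defensive choice here, since launchers may inject additional flags the script does not declare.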
### My System Info peft==0.4.0 accelerate==0.18.0 transformers==4.28.0 py310 ### Reproduction After training, I merge the PEFT weights with the base model using: ``` model_ft = PeftModel.from_pretrained( AutoModelForCausalLM.from_pretrained( base_model_path, return_dict=True, torch_dtype='auto', use_cache=True,...
Even with an NVIDIA A100 80 GB GPU, I am not able to fine-tune the model at the full sequence length of 8192. I was not able to fine-tune...
The HuggingFaceH4/oasst1_en dataset contains "train_idf" and "test_idf" splits instead of "train" and "test"
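One way to cope with non-standard split names like these is to resolve the split name defensively before indexing the dataset. A minimal sketch, assuming the loaded dataset behaves like a dict keyed by split name (as a `datasets.DatasetDict` does); the helper name `resolve_split` and the stand-in dict are illustrative:

```python
def resolve_split(dataset_splits, *candidates):
    """Return the first split name that actually exists in the dataset.

    Works with anything dict-like keyed by split name, e.g. a
    datasets.DatasetDict. `candidates` are tried in order.
    """
    for name in candidates:
        if name in dataset_splits:
            return name
    raise KeyError(f"none of {candidates!r} found in dataset splits")

# Stand-in for HuggingFaceH4/oasst1_en, which exposes train_idf/test_idf
# rather than the conventional train/test names:
splits = {"train_idf": [], "test_idf": []}

train_name = resolve_split(splits, "train", "train_idf")
test_name = resolve_split(splits, "test", "test_idf")
print(train_name, test_name)  # → train_idf test_idf
```

The same lookup then feeds whatever training code expects `dataset[train_name]`, so a rename upstream only requires adding one more candidate.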
```
Exception in thread Thread-7:
Traceback (most recent call last):
  File "/data/starCoder/software/conda/envs/torch/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/data/starCoder/software/conda/envs/torch/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/data/starCoder/software/conda/envs/torch/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
...
```
I don't want to use 8-bit training; I hope to use fp16 training. After commenting out these two lines, there was an error. How should I modify it? In addition,...