Fine-tuning with SQLCoder-7b
I'm new to this area of language models. For my use case I want to fine-tune the SQLCoder model on the Spider dataset using this code base, since this repo was working for me: following the instructions given in the README, I'm able to start training the StarCoder model with the ArmelR/stack-exchange-instruction dataset.
I replaced the model path and the dataset name in the command:
!python finetune/finetune.py --model_path="defog/sqlcoder-7b" --dataset_name="spider" --subset="data/finetune" --split="train" --size_valid_set 1000 --streaming --seq_length 1024 --max_steps 1000 --batch_size 1 --input_column_name="question" --output_column_name="query" --gradient_accumulation_steps 16 --learning_rate 1e-4 --lr_scheduler_type="cosine" --num_warmup_steps 100 --weight_decay 0.05 --output_dir="./checkpoints"
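As a quick sanity check (my own sketch, not part of the repo; it assumes the `datasets` library is installed and the Hub is reachable), I verified that the Spider dataset actually exposes the column names I pass above:

```python
# My own check, not from the repo: confirm Spider has the columns
# passed as --input_column_name / --output_column_name.
from datasets import load_dataset

ds = load_dataset("spider", split="train", streaming=True)
sample = next(iter(ds))
print(sample.keys())  # I expect 'question' and 'query' among the keys
```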
I'm facing an issue with the attention mask shape as soon as training starts. I understand that just changing the model path isn't enough to switch to a different model, so please give me some suggestions on getting the training running; a quick architecture check I ran is shown right after this. I'm linking my Kaggle notebook here to get started: https://www.kaggle.com/code/bhrt16/notebookb5fd138c63
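For context, here is the minimal check I ran before filing this (my own sketch, assuming `transformers` is installed; the interpretation in the comments is my guess, not a confirmed diagnosis):

```python
# My own sketch: check which model family defog/sqlcoder-7b belongs to.
# The repo's finetune script was written around StarCoder, whose
# config.model_type is "gpt_bigcode"; sqlcoder-7b downloads a
# SentencePiece tokenizer.model (visible in the log below), which
# points to a Llama/Mistral-family model instead.
from transformers import AutoConfig, AutoTokenizer

cfg = AutoConfig.from_pretrained("defog/sqlcoder-7b")
tok = AutoTokenizer.from_pretrained("defog/sqlcoder-7b")
print(cfg.model_type, type(tok).__name__)

if tok.pad_token is None:
    # Llama-style tokenizers often ship without a pad token; my guess is
    # that family differences like this relate to the attention-mask error.
    print("no pad_token set; may need tok.pad_token = tok.eos_token")
```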
This is the error log:
/opt/conda/lib/python3.10/site-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.16.5 and <1.23.0 is required for this version of SciPy (detected version 1.24.3
warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
/opt/conda/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py:691: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
warnings.warn(
tokenizer_config.json: 100%|███████████████████| 915/915 [00:00<00:00, 4.98MB/s]
tokenizer.model: 100%|███████████████████████| 493k/493k [00:00<00:00, 1.11MB/s]
tokenizer.json: 100%|██████████████████████| 1.80M/1.80M [00:00<00:00, 51.6MB/s]
special_tokens_map.json: 100%|████████████████| 72.0/72.0 [00:00<00:00, 448kB/s]
/opt/conda/lib/python3.10/site-packages/datasets/load.py:2088: FutureWarning: 'use_auth_token' was deprecated in favor of 'token' in version 2.14.0 and will be removed in 3.0.0.
You can remove this warning by passing 'token=<use_auth_token>' instead.
warnings.warn(
Loading the dataset in streaming mode
100%|████████████████████████████████████████| 400/400 [00:03<00:00, 110.05it/s]
The character to token ratio of the dataset is: 3.16
Loading the model
/opt/conda/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py:472: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
warnings.warn(
Loading checkpoint shards: 100%|██████████████████| 2/2 [01:07<00:00, 33.92s/it]
/opt/conda/lib/python3.10/site-packages/peft/utils/other.py:141: FutureWarning: prepare_model_for_int8_training is deprecated and will be removed in a future version. Use prepare_model_for_kbit_training instead.
warnings.warn(
trainable params: 41943040 || all params: 3794014208 || trainable%: 1.1055056122762943
Starting main loop
Training...
wandb: Currently logged in as: bhrt95. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.16.1
wandb: Run data is saved locally in /kaggle/working/starcoder/wandb/run-20231213_114310-6pzqbs68
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run StarCoder-finetuned
wandb: ⭐️ View project at https://wandb.ai/bhrt95/huggingface
wandb: 🚀 View run at https://wandb.ai/bhrt95/huggingface/runs/6pzqbs68
/opt/conda/lib/python3.10/site-packages/torch/utils/checkpoint.py:429: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
warnings.warn(
Traceback (most recent call last):
File "/kaggle/working/starcoder/finetune/finetune.py", line 326, in