finetune-gpt2xl

Guide: Finetune GPT2-XL (1.5 billion parameters) and GPT-NEO (2.7B) on a single GPU with Huggingface Transformers using DeepSpeed

8 finetune-gpt2xl issues, sorted by recently updated

Hi, I'm trying to train gpt2-xl but keep getting OOM, even when I set batch size to 1, gradient accumulation to 8/16/512, contiguous_gradients to false, and allgather_bucket_size / reduce_bucket_size to 2e2. I...
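
For context, the knobs this report mentions live in the repo's ds_config.json. Below is a minimal ZeRO stage 2 sketch with those settings; the values are illustrative, not a guaranteed fix. Note that the bucket sizes are commonly on the order of 2e8, so the 2e2 above is likely a typo, and shrinking them trades speed for memory. Key names follow current DeepSpeed docs; older releases used `"cpu_offload": true` instead of the `offload_optimizer` block.

```json
{
  "fp16": { "enabled": true },
  "zero_optimization": {
    "stage": 2,
    "offload_optimizer": { "device": "cpu" },
    "contiguous_gradients": true,
    "allgather_bucket_size": 2e8,
    "reduce_bucket_size": 2e8
  },
  "train_micro_batch_size_per_gpu": 1,
  "gradient_accumulation_steps": 8
}
```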

I tried to use your script (gpt2-xl) but I get an error: AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'. pip list (excerpt): certifi 2021.5.30, charset-normalizer 2.0.4, click...
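
A missing `ds_opt_adam` attribute usually means DeepSpeed's CPU Adam C++ extension was never compiled. One common remedy (a hedged suggestion, not something this repo documents; it requires a CUDA toolchain matching your PyTorch build) is to pre-build the op at install time instead of relying on JIT compilation at first use:

```sh
# Force-compile the CPU Adam extension while installing DeepSpeed.
DS_BUILD_CPU_ADAM=1 pip install deepspeed --no-cache-dir
```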

I got the following error: [2022-01-13 14:47:32,154] [INFO] [launch.py:131:sigkill_handler] Killing subprocess 2273 Traceback (most recent call last): File "/home/ubuntu/anaconda3/envs/gpt2_lm/lib/python3.8/runpy.py", line 194, in _run_module_as_main return _run_code(code, main_globals, None, File "/home/ubuntu/anaconda3/envs/gpt2_lm/lib/python3.8/runpy.py", line...

Hello, I'm interested in adding this feature: a function in text2csv.py that takes a folder of texts, with run_clm.py then padding and truncating them instead of using the group_text... (a rough sketch follows below).
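
A rough sketch of what such a helper could look like; the function name `folder_to_csv` and the single `text` column are assumptions for illustration, not the repo's actual API:

```python
import csv
from pathlib import Path

def folder_to_csv(folder: str, out_csv: str) -> None:
    """Hypothetical helper: write each .txt file in `folder` as one row of `out_csv`."""
    paths = sorted(Path(folder).glob("*.txt"))
    with open(out_csv, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["text"])  # one text column, matching the train.csv layout
        for p in paths:
            writer.writerow([p.read_text(encoding="utf-8").strip()])
```

Per-document padding and truncation would then happen in the tokenizer call (e.g. `truncation=True` with a `max_length`) rather than by concatenating and chunking.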

I got this error: Traceback (most recent call last): File "run_clm.py", line 478, in main() File "run_clm.py", line 271, in main datasets = load_dataset( File "/root/miniconda3/lib/python3.8/site-packages/datasets/load.py", line 742, in load_dataset...
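
Failures at this point come from the `datasets` library's `load_dataset` call. A minimal standalone reproduction of the CSV loading path (a sketch of roughly what run_clm.py does, not its exact code) can isolate whether the CSV files or the installed `datasets` version are at fault:

```python
from datasets import load_dataset

# If this fails with the same traceback, the problem is in the CSV files or
# the `datasets` install, not in the training script itself.
datasets = load_dataset(
    "csv",
    data_files={"train": "train.csv", "validation": "validation.csv"},
)
print(datasets)
```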

(gh_finetune-gpt2xl) r730ub20@r730ub20-M0:~/llm_dev/finetune-gpt2xl$ deepspeed --num_gpus=1 run_clm.py --deepspeed ds_config.json --model_name_or_path gpt2-xl --train_file train.csv --validation_file validation.csv --do_train --do_eval --fp16 --overwrite_cache --evaluation_strategy="steps" --output_dir finetuned --eval_steps 200 --num_train_epochs 1 --gradient_accumulation_steps 2 --per_device_train_batch_size 1 [2023-05-22 22:00:31,576]...
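
For readability, the invocation from this report reflowed, with the same flags and no changes:

```sh
deepspeed --num_gpus=1 run_clm.py \
  --deepspeed ds_config.json \
  --model_name_or_path gpt2-xl \
  --train_file train.csv \
  --validation_file validation.csv \
  --do_train --do_eval --fp16 \
  --overwrite_cache \
  --evaluation_strategy="steps" \
  --output_dir finetuned \
  --eval_steps 200 \
  --num_train_epochs 1 \
  --gradient_accumulation_steps 2 \
  --per_device_train_batch_size 1
```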

Hi, this is not an issue, but I was not sure where else to post it: how can this tool be adapted for fine-tuning GPT-J 6B?
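
In principle the same script can target any causal LM on the Hugging Face Hub, so a first attempt might only swap the model name. This is an untested sketch: the `ds_config_gptneo.json` file name is an assumption (a ZeRO stage 3 config with offload, as used for the 2.7B model, would be the natural starting point), and at 6B parameters you would also need a transformers version that includes GPT-J support and substantially more CPU RAM or NVMe offload:

```sh
deepspeed --num_gpus=1 run_clm.py \
  --deepspeed ds_config_gptneo.json \
  --model_name_or_path EleutherAI/gpt-j-6B \
  --train_file train.csv \
  --validation_file validation.csv \
  --do_train --do_eval --fp16 \
  --num_train_epochs 1 \
  --gradient_accumulation_steps 2 \
  --per_device_train_batch_size 1 \
  --output_dir finetuned_gptj
```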

Hi, I'd like to start with a big thanks for your amazing work. I would like to use your library to fine-tune GPT-NEO on a Text2Text task instead of...
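
Since run_clm.py trains a plain causal LM, the usual workaround for Text2Text is to serialize each (input, target) pair into a single training string with a separator the model can learn. A hedged sketch; the separator string is an arbitrary choice for illustration, not something the repo defines:

```python
# Hypothetical pre-processing for Text2Text with a causal LM: join source and
# target into one string. At inference time, prompt with `source + SEP` and
# let the model generate the target.
SEP = "\n###\n"

def to_causal_example(source: str, target: str) -> str:
    return source + SEP + target

print(to_causal_example("Translate to French: Hello", "Bonjour"))
```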