OOM Error on fine-tuning gpt-j-6b
I am trying to fine-tune gpt-j-6b on wikitext data using the run_clm.py script provided here: https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling
I am launching it as follows:
accelerate launch run_clm.py \
--model_name_or_path EleutherAI/gpt-j-6b \
--dataset_name wikitext \
--dataset_config_name wikitext-2-raw-v1 \
--per_device_train_batch_size 8 \
--per_device_eval_batch_size 8 \
--do_train \
--do_eval \
--output_dir /tmp/test-clm
I am trying to use FSDP to run this on NVIDIA A100 GPUs, each with 40 GB of memory. I tried running on 2 and 4 GPUs and it still gives an Out Of Memory error, even after reducing the train and eval batch sizes to 1. I am not sure whether I can run this with the accelerate library at all; I thought 4 x 40 GB A100 GPUs should be enough.
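For context on the memory budget, here is a rough back-of-envelope sketch (assumptions, not from the thread: the example scripts load the model in fp32 and use Adam, which keeps two fp32 moment tensors per parameter; FULL_SHARD splits this persistent state evenly across ranks):

# Rough sketch only, not a measurement: persistent training state for
# GPT-J-6B with Adam, assuming fp32 weights, fp32 gradients and fp32 moments.
n_params = 6.05e9                      # GPT-J-6B parameter count
GiB = 2 ** 30
weights = 4 * n_params                 # fp32 parameters       ~22.5 GiB
grads   = 4 * n_params                 # fp32 gradients        ~22.5 GiB
adam    = 8 * n_params                 # exp_avg + exp_avg_sq  ~45.1 GiB
total = weights + grads + adam
print(f"persistent state: ~{total / GiB:.0f} GiB")   # ~90 GiB
for n_gpus in (2, 4, 6):
    # FULL_SHARD shards this state across ranks; activations and any
    # temporarily all-gathered parameters are additional per-GPU costs.
    print(f"{n_gpus} GPUs -> ~{total / n_gpus / GiB:.0f} GiB of sharded state per GPU")

Activations, temporarily all-gathered layers, and CUDA/NCCL buffers come on top of the sharded state, so the headroom on 40 GB cards is tight, and the auto wrap policy (discussed below) decides how much of the model is gathered at once.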
Could you provide us with the result of accelerate env as requested in the issue template? Also cc @pacman100
Hello @hsuyab, could you please provide the accelerate config that you are using?
Also, with accelerate launch you must use run_clm_no_trainer.py, not run_clm.py. run_clm.py uses the Trainer's FSDP integration, with the related args described here: https://huggingface.co/docs/transformers/main_classes/trainer#pytorch-fully-sharded-data-parallel
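For reference, a minimal sketch (not from this thread; exact argument names may differ slightly across transformers versions) of the Trainer-side FSDP settings that run_clm.py picks up from the CLI flags described in the doc linked above:

# Sketch only: the Trainer's own FSDP integration, which run_clm.py exposes
# as the --fsdp and --fsdp_transformer_layer_cls_to_wrap command-line flags.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="/tmp/test-clm",
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    bf16=True,
    fsdp="full_shard auto_wrap",                     # enable PyTorch FSDP inside Trainer
    fsdp_transformer_layer_cls_to_wrap="GPTJBlock",  # block class to auto-wrap
)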
This is the accelerate config I am using
compute_environment: LOCAL_MACHINE
distributed_type: FSDP
downcast_bf16: 'no'
fsdp_config:
fsdp_auto_wrap_policy: NO_WRAP
fsdp_backward_prefetch_policy: BACKWARD_PRE
fsdp_offload_params: false
fsdp_sharding_strategy: 1
fsdp_state_dict_type: FULL_STATE_DICT
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 0
num_processes: 4
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
Yes, I tried run_clm_no_trainer.py as well. The command I used was as below:
accelerate launch run_clm_no_trainer.py \
--dataset_name wikitext \
--dataset_config_name wikitext-2-raw-v1 \
--model_name_or_path gpt2 \
--output_dir /tmp/test-clm
Can you see if I am doing anything wrong? Ideally it should work, right?
NO_WRAP won't do anything and is equivalent to DDP. Please use TRANSFORMER_BASED_WRAP; refer to https://huggingface.co/docs/accelerate/usage_guides/fsdp
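For intuition, this is roughly what TRANSFORMER_BASED_WRAP expands to under the hood (a sketch; accelerate builds an equivalent policy from the fsdp_transformer_layer_cls_to_wrap entry in the config):

# Sketch: with a transformer-based auto-wrap policy, each GPTJBlock becomes
# its own FSDP unit, so only one block's parameters need to be gathered at a
# time instead of the whole 6B-parameter model.
import functools
from torch.distributed.fsdp.wrap import transformer_auto_wrap_policy
from transformers.models.gptj.modeling_gptj import GPTJBlock

auto_wrap_policy = functools.partial(
    transformer_auto_wrap_policy,
    transformer_layer_cls={GPTJBlock},
)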
In that guide I could not find a reference to what the auto wrap policy should be. Among the options, should I go with GPTJBlock?
Yes. Also, next time please use the forums for suggestions or help, as this isn't an issue per se. And if you do raise an issue, make sure to provide all the required info, like the output of accelerate env, the accelerate config, etc. It will help us resolve the issue faster.
Hi, I used GPTJBlock and am getting this error:
Traceback (most recent call last):
File "run_clm_no_trainer.py", line 685, in <module>
main()
File "run_clm_no_trainer.py", line 510, in main
model, optimizer, train_dataloader, eval_dataloader, lr_scheduler = accelerator.prepare(
File "/opt/conda/lib/python3.8/site-packages/accelerate/accelerator.py", line 1122, in prepare
result = tuple(
File "/opt/conda/lib/python3.8/site-packages/accelerate/accelerator.py", line 1123, in <genexpr>
self._prepare_one(obj, first_pass=True, device_placement=d) for obj, d in zip(args, device_placement)
File "/opt/conda/lib/python3.8/site-packages/accelerate/accelerator.py", line 977, in _prepare_one
return self.prepare_model(obj, device_placement=device_placement)
File "/opt/conda/lib/python3.8/site-packages/accelerate/accelerator.py", line 1211, in prepare_model
self.state.fsdp_plugin.set_auto_wrap_policy(model)
File "/opt/conda/lib/python3.8/site-packages/accelerate/utils/dataclasses.py", line 836, in set_auto_wrap_policy
raise Exception("Could not find the transformer layer class to wrap in the model.")
Exception: Could not find the transformer layer class to wrap in the model.
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 1512) of binary: /opt/conda/bin/python3
Traceback (most recent call last):
File "/opt/conda/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/opt/conda/lib/python3.8/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
args.func(args)
File "/opt/conda/lib/python3.8/site-packages/accelerate/commands/launch.py", line 910, in launch_command
multi_gpu_launcher(args)
File "/opt/conda/lib/python3.8/site-packages/accelerate/commands/launch.py", line 603, in multi_gpu_launcher
distrib_run.run(args)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 785, in run
elastic_launch(
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError
I thought you were using GPT-J, but the command above is using gpt2. Please refer to the modeling code in transformers to get the transformer block name. For GPT2, it would be GPT2Block.
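One quick way (a sketch; it only inspects class names, no weights are downloaded) to find the right value for fsdp_transformer_layer_cls_to_wrap is to look at the model's modeling module in transformers:

# Sketch: list the classes defined in the modeling files whose names end in
# "Block"; that name goes into fsdp_transformer_layer_cls_to_wrap.
import transformers.models.gptj.modeling_gptj as modeling_gptj
import transformers.models.gpt2.modeling_gpt2 as modeling_gpt2

print([n for n in dir(modeling_gptj) if n.endswith("Block")])  # expect ['GPTJBlock']
print([n for n in dir(modeling_gpt2) if n.endswith("Block")])  # expect ['GPT2Block']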
Yes. Also, next time please use the forums for suggestions or help, as this isn't an issue per se. And if you do raise an issue, make sure to provide all the required info, like the output of accelerate env, the accelerate config, etc. It will help us resolve the issue faster.
Sure; since this was my first question, I raised it here. Here is the accelerate env:
- `Accelerate` version: 0.18.0
- Platform: Linux-5.4.0-136-generic-x86_64-with-glibc2.17
- Python version: 3.8.11
- Numpy version: 1.19.5
- PyTorch version (GPU?): 2.0.0+cu117 (True)
- `Accelerate` default config:
- compute_environment: LOCAL_MACHINE
- distributed_type: FSDP
- mixed_precision: bf16
- use_cpu: False
- num_processes: 2
- machine_rank: 0
- num_machines: 0
- rdzv_backend: static
- same_network: True
- main_training_function: main
- fsdp_config: {'fsdp_auto_wrap_policy': 'TRANSFORMER_BASED_WRAP', 'fsdp_backward_prefetch_policy': 'BACKWARD_PRE', 'fsdp_offload_params': False, 'fsdp_sharding_strategy': 1, 'fsdp_state_dict_type': 'FULL_STATE_DICT', 'fsdp_transformer_layer_cls_to_wrap': '`GPTJBlock`'}
- downcast_bf16: no
- tpu_use_cluster: False
- tpu_use_sudo: False
- tpu_env: []
By forums, I meant https://discuss.huggingface.co/c/accelerate/18
I thought you were using GPT-J, but the command above is using gpt2. Please refer to the modeling code in transformers to get the transformer block name. For GPT2, it would be GPT2Block.
Sorry, I pasted the wrong command. I used this:
accelerate launch run_clm_no_trainer.py \
--dataset_name wikitext \
--dataset_config_name wikitext-2-raw-v1 \
--model_name_or_path EleutherAI/gpt-j-6b \
--output_dir /tmp/test-clm
Remove the backticks from the layer class name in the config: it should be 'GPTJBlock', not '`GPTJBlock`'.
Hi, I modified that and now I get:
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling 'cublasCreate(handle)'
@pacman100 Updated accelerate env:
Copy-and-paste the text below in your GitHub issue
- `Accelerate` version: 0.18.0
- Platform: Linux-5.4.0-136-generic-x86_64-with-glibc2.10
- Python version: 3.8.12
- Numpy version: 1.22.2
- PyTorch version (GPU?): 1.13.1+cu116 (True)
- `Accelerate` default config:
- compute_environment: LOCAL_MACHINE
- distributed_type: FSDP
- mixed_precision: bf16
- use_cpu: False
- num_processes: 6
- machine_rank: 0
- num_machines: 0
- rdzv_backend: static
- same_network: True
- main_training_function: main
- fsdp_config: {'fsdp_auto_wrap_policy': 'TRANSFORMER_BASED_WRAP', 'fsdp_backward_prefetch_policy': 'BACKWARD_PRE', 'fsdp_offload_params': False, 'fsdp_sharding_strategy': 1, 'fsdp_state_dict_type': 'FULL_STATE_DICT', 'fsdp_transformer_layer_cls_to_wrap': 'GPTJBlock'}
- downcast_bf16: no
- tpu_use_cluster: False
- tpu_use_sudo: False
- tpu_env: []
Command run (here I tried to run on 6 A100 GPUs, 40 GB each):
accelerate launch run_clm_no_trainer.py \
--dataset_name wikitext \
--per_device_train_batch_size 1 \
--per_device_eval_batch_size 1 \
--gradient_accumulation_steps 8 \
--dataset_config_name wikitext-2-raw-v1 \
--model_name_or_path EleutherAI/gpt-j-6b \
--output_dir /tmp/test-clm
Error file: error_output.txt
Hi @pacman100, can you check this once? I am still facing the OOM error in this use case.
Hi @sgugger, can you look at this?
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
I still haven't figured out the solution; however, I am closing this issue.
I can confirm the above code example runs on 2 A100s (80 GB each).
config:
- `Accelerate` version: 0.21.0.dev0
- Platform: Linux-5.4.0-125-generic-x86_64-with-glibc2.31
- Python version: 3.11.3
- Numpy version: 1.24.3
- PyTorch version (GPU?): 2.1.0.dev20230620 (True)
- PyTorch XPU available: False
- System RAM: 503.55 GB
- GPU type: NVIDIA A100-SXM4-80GB
- `Accelerate` default config:
- compute_environment: LOCAL_MACHINE
- distributed_type: FSDP
- mixed_precision: bf16
- use_cpu: False
- num_processes: 2
- machine_rank: 0
- num_machines: 1
- rdzv_backend: static
- same_network: True
- main_training_function: main
- fsdp_config: {'fsdp_auto_wrap_policy': 'TRANSFORMER_BASED_WRAP', 'fsdp_backward_prefetch_policy': 'BACKWARD_PRE', 'fsdp_forward_prefetch': False, 'fsdp_offload_params': False, 'fsdp_sharding_strategy': 1, 'fsdp_state_dict_type': 'SHARDED_STATE_DICT', 'fsdp_sync_module_states': False, 'fsdp_transformer_layer_cls_to_wrap': 'GPTJBlock', 'fsdp_use_orig_params': True}
- downcast_bf16: no
- tpu_use_cluster: False
- tpu_use_sudo: False
- tpu_env: []
command:
accelerate launch run_clm_no_trainer.py --dataset_name wikitext --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --gradient_accumulation_steps 8 --dataset_config_name wikitext-2-raw-v1 --model_name_or_path EleutherAI/gpt-j-6b --output_dir /tmp/test-clm
output logs:
[2023-06-22 13:40:09,980] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2023-06-22 13:40:12,920] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2023-06-22 13:40:12,963] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
06/22/2023 13:40:14 - INFO - __main__ - Distributed environment: DistributedType.FSDP Backend: nccl
Num processes: 2
Process index: 1
Local process index: 1
Device: cuda:1
Mixed precision type: bf16
06/22/2023 13:40:14 - INFO - __main__ - Distributed environment: DistributedType.FSDP Backend: nccl
Num processes: 2
Process index: 0
Local process index: 0
Device: cuda:0
Mixed precision type: bf16
100%|██████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 1341.32it/s]
06/22/2023 13:40:16 - WARNING - datasets.builder - Found cached dataset wikitext (/raid/sourab/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126)
100%|██████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 1424.21it/s]
loading configuration file config.json from cache at /raid/sourab/.cache/huggingface/models--EleutherAI--gpt-j-6b/snapshots/47e169305d2e8376be1d31e765533382721b2cc1/config.json
Model config GPTJConfig {
"_name_or_path": "EleutherAI/gpt-j-6b",
"activation_function": "gelu_new",
"architectures": [
"GPTJForCausalLM"
],
"attn_pdrop": 0.0,
"bos_token_id": 50256,
"embd_pdrop": 0.0,
"eos_token_id": 50256,
"gradient_checkpointing": false,
"initializer_range": 0.02,
"layer_norm_epsilon": 1e-05,
"model_type": "gptj",
"n_embd": 4096,
"n_head": 16,
"n_inner": null,
"n_layer": 28,
"n_positions": 2048,
"resid_pdrop": 0.0,
"rotary": true,
"rotary_dim": 64,
"scale_attn_weights": true,
"summary_activation": null,
"summary_first_dropout": 0.1,
"summary_proj_to_labels": true,
"summary_type": "cls_index",
"summary_use_proj": true,
"task_specific_params": {
"text-generation": {
"do_sample": true,
"max_length": 50,
"temperature": 1.0
}
},
"tie_word_embeddings": false,
"tokenizer_class": "GPT2Tokenizer",
"transformers_version": "4.31.0.dev0",
"use_cache": true,
"vocab_size": 50400
}
loading file vocab.json from cache at /raid/sourab/.cache/huggingface/models--EleutherAI--gpt-j-6b/snapshots/47e169305d2e8376be1d31e765533382721b2cc1/vocab.json
loading file merges.txt from cache at /raid/sourab/.cache/huggingface/models--EleutherAI--gpt-j-6b/snapshots/47e169305d2e8376be1d31e765533382721b2cc1/merges.txt
loading file tokenizer.json from cache at /raid/sourab/.cache/huggingface/models--EleutherAI--gpt-j-6b/snapshots/47e169305d2e8376be1d31e765533382721b2cc1/tokenizer.json
loading file added_tokens.json from cache at /raid/sourab/.cache/huggingface/models--EleutherAI--gpt-j-6b/snapshots/47e169305d2e8376be1d31e765533382721b2cc1/added_tokens.json
loading file special_tokens_map.json from cache at /raid/sourab/.cache/huggingface/models--EleutherAI--gpt-j-6b/snapshots/47e169305d2e8376be1d31e765533382721b2cc1/special_tokens_map.json
loading file tokenizer_config.json from cache at /raid/sourab/.cache/huggingface/models--EleutherAI--gpt-j-6b/snapshots/47e169305d2e8376be1d31e765533382721b2cc1/tokenizer_config.json
loading weights file pytorch_model.bin from cache at /raid/sourab/.cache/huggingface/models--EleutherAI--gpt-j-6b/snapshots/47e169305d2e8376be1d31e765533382721b2cc1/pytorch_model.bin
Generate config GenerationConfig {
"_from_model_config": true,
"bos_token_id": 50256,
"eos_token_id": 50256,
"transformers_version": "4.31.0.dev0"
}
All model checkpoint weights were used when initializing GPTJForCausalLM.
All the weights of GPTJForCausalLM were initialized from the model checkpoint at EleutherAI/gpt-j-6b.
If your task is similar to the task the model of the checkpoint was trained on, you can already use GPTJForCausalLM for predictions without further training.
Generation config file not found, using a generation config created from the model config.
06/22/2023 13:41:06 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /raid/sourab/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-76793d46c8ad9166.arrow
06/22/2023 13:41:06 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /raid/sourab/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-3fd6912b02345416.arrow
06/22/2023 13:41:06 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /raid/sourab/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-4fe7c1e41fc39277.arrow
06/22/2023 13:41:06 - WARNING - __main__ - The chosen tokenizer supports a `model_max_length` that is longer than the default `block_size` value of 1024. If you would like to use a longer `block_size` up to `tokenizer.model_max_length` you can override this default with `--block_size xxx`.
06/22/2023 13:41:06 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /raid/sourab/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-b9b8b2fd6c69082c.arrow
06/22/2023 13:41:06 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /raid/sourab/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-cf0b5700aab01e12.arrow
06/22/2023 13:41:06 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /raid/sourab/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-99e8c4ce56ab7f5b.arrow
06/22/2023 13:41:07 - INFO - __main__ - Sample 1731 of the training set: {'input_ids': [...], 'attention_mask': [...], 'labels': [...]}.
06/22/2023 13:41:07 - INFO - __main__ - Sample 1429 of the training set: {'input_ids': [...], 'attention_mask': [...], 'labels': [...]}.
06/22/2023 13:41:07 - INFO - __main__ - Sample 1410 of the training set: {'input_ids': [...], 'attention_mask': [...], 'labels': [...]}.
06/22/2023 13:41:07 - WARNING - accelerate.accelerator - FSDP Warning: When using FSDP, it is efficient and recommended to call prepare for the model before creating the optimizer
06/22/2023 13:41:09 - WARNING - accelerate.accelerator - FSDP Warning: When using FSDP, several parameter groups will be conflated into a single one due to nested module wrapping and parameter flattening.
06/22/2023 13:41:09 - INFO - __main__ - ***** Running training *****
06/22/2023 13:41:09 - INFO - __main__ - Num examples = 2318
06/22/2023 13:41:09 - INFO - __main__ - Num Epochs = 3
06/22/2023 13:41:09 - INFO - __main__ - Instantaneous batch size per device = 1
06/22/2023 13:41:09 - INFO - __main__ - Total train batch size (w. parallel, distributed & accumulation) = 16
06/22/2023 13:41:09 - INFO - __main__ - Gradient Accumulation steps = 8
06/22/2023 13:41:09 - INFO - __main__ - Total optimization steps = 435
13%|███████████▋ | 57/435 [07:58<53:22, 8.47s/it]
However, it took 76058 MiB of VRAM using the main branch of accelerate.
With 4 GPUs, I am seeing 62264 MB of VRAM usage.
Sorry for hijacking this thread, but I am getting an OOM error when using 8 A100 40GB GPUs. Since the total GPU memory is more than in your setup, I am curious why it is not working!
Here is my accelerate env:
- `Accelerate` version: 0.20.3
- Platform: Linux-4.19.0-22-cloud-amd64-x86_64-with-glibc2.28
- Python version: 3.9.16
- Numpy version: 1.25.0
- PyTorch version (GPU?): 2.0.1+cu117 (True)
- PyTorch XPU available: False
- System RAM: 669.27 GB
- GPU type: NVIDIA A100-SXM4-40GB
- `Accelerate` default config:
- compute_environment: LOCAL_MACHINE
- distributed_type: FSDP
- mixed_precision: bf16
- use_cpu: False
- num_processes: 8
- machine_rank: 0
- num_machines: 1
- rdzv_backend: static
- same_network: True
- main_training_function: main
- fsdp_config: {'fsdp_auto_wrap_policy': 'TRANSFORMER_BASED_WRAP', 'fsdp_backward_prefetch_policy': 'BACKWARD_PRE', 'fsdp_offload_params': False, 'fsdp_sharding_strategy': 1, 'fsdp_state_dict_type': 'SHARDED_STATE_DICT', 'fsdp_transformer_layer_cls_to_wrap': 'GPTJBlock'}
- downcast_bf16: no
- tpu_use_cluster: False
- tpu_use_sudo: False
- tpu_env: []
This is the command I used
accelerate launch run_clm_no_trainer.py --dataset_name wikitext --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --gradient_accumulation_steps 8 --dataset_config_name wikitext-2-raw-v1 --model_name_or_path EleutherAI/gpt-j-6b --output_dir /tmp/test-clm --low_cpu_mem_usage
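For reference, here is the same setup written out as the accelerate config YAML it corresponds to (a sketch reconstructed from the accelerate env dump above, so the field values simply mirror that output):
compute_environment: LOCAL_MACHINE
distributed_type: FSDP
downcast_bf16: 'no'
fsdp_config:
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_backward_prefetch_policy: BACKWARD_PRE
  fsdp_offload_params: false
  fsdp_sharding_strategy: 1
  fsdp_state_dict_type: SHARDED_STATE_DICT
  fsdp_transformer_layer_cls_to_wrap: GPTJBlock
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 8
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false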
Hello, I believe that the large sequence length used during CLM training (1024 by default, I think) is leading to this behaviour. Could you try applying gradient checkpointing following this tutorial: https://github.com/lessw2020/transformer_central/tree/main/activation_checkpointing_tutorial
This should greatly reduce the memory usage at the cost of speed.
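The core of that tutorial applied to GPT-J looks roughly like the sketch below. It is illustrative only: it assumes `model` is the FSDP-wrapped model returned by accelerator.prepare() and a recent PyTorch (2.0+) where apply_activation_checkpointing is available.
from functools import partial

from torch.distributed.algorithms._checkpoint.checkpoint_wrapper import (
    CheckpointImpl,
    apply_activation_checkpointing,
    checkpoint_wrapper,
)
from transformers.models.gptj.modeling_gptj import GPTJBlock

# Non-reentrant checkpointing recomputes each block's forward pass during
# backward instead of storing its activations, trading compute for memory.
non_reentrant_wrapper = partial(
    checkpoint_wrapper,
    checkpoint_impl=CheckpointImpl.NO_REENTRANT,
)

# Wrap every GPTJBlock (the same layer class used for TRANSFORMER_BASED_WRAP)
# in an activation-checkpointing wrapper.
apply_activation_checkpointing(
    model,
    checkpoint_wrapper_fn=non_reentrant_wrapper,
    check_fn=lambda submodule: isinstance(submodule, GPTJBlock),
)
This should be run after accelerator.prepare(), so the checkpoint wrappers sit inside the FSDP units.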
Thank you very much for the quick response!
I used run_clm.py with --gradient_checkpointing True and it seems to work for GPT-J with block_size 1024! However, it does not work for Salesforce/CodeGen-Mono-6B, and it also doesn't work if I increase block_size to 2048. For now I am able to run my job using DeepSpeed ZeRO stage 3. Is this expected even with gradient checkpointing?
Assuming yes, why does FSDP not work here? Is it because CPU offloading is disabled (which currently hangs the job in FSDP mixed precision)?
Finally, I wanted to gain some understanding of the setup: is there any resource for understanding the performance impact of the different configurations here, e.g.
- using deepspeed vs fsdp
- 4 A100-80GB vs 8 A100-40GB
and will the answer depend on block_size and model_size?
Happy to migrate this to the forum!
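For what it's worth, run_clm_no_trainer.py has no --gradient_checkpointing flag, so the equivalent of the Trainer flag can be applied by calling the transformers API directly before accelerator.prepare(). A minimal sketch, assuming a recent transformers version where gradient_checkpointing_enable() is available:
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6b", low_cpu_mem_usage=True
)

# Enable HF-native gradient checkpointing before FSDP wrapping, and disable the
# KV cache, which is incompatible with checkpointing during training.
model.gradient_checkpointing_enable()
model.config.use_cache = False

# optimizer, dataloaders and lr_scheduler are created exactly as in the script.
model, optimizer, train_dataloader, eval_dataloader, lr_scheduler = accelerator.prepare(
    model, optimizer, train_dataloader, eval_dataloader, lr_scheduler
)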
Please move this to the forum along with all the details, such as the library versions and a minimal reproducible example with the command being run and the configs.
Created a post at https://discuss.huggingface.co/t/fsdp-oom-issue-and-comparision-to-deepspeed/44292/1