`ValueError("Integer parameters are unsupported")` raised when using FSDP with `load_in_8bit=True`
System Info
- `Accelerate` version: 0.18.0
- Platform: Linux-6.1.24-x86_64-with-glibc2.37
- Python version: 3.10.10
- Numpy version: 1.24.3
- PyTorch version (GPU?): 1.13.1+cu117 (True)
- `Accelerate` default config:
- compute_environment: LOCAL_MACHINE
- distributed_type: FSDP
- mixed_precision: fp16
- use_cpu: False
- num_processes: 4
- machine_rank: 0
- num_machines: 4
- main_process_ip: 0.0.0.0
- main_process_port: 8080
- rdzv_backend: static
- same_network: True
- main_training_function: main
- fsdp_config: {'fsdp_auto_wrap_policy': 'TRANSFORMER_BASED_WRAP', 'fsdp_backward_prefetch_policy': 'BACKWARD_PRE', 'fsdp_offload_params': False, 'fsdp_sharding_strategy': 1, 'fsdp_state_dict_type': 'FULL_STATE_DICT', 'fsdp_transformer_layer_cls_to_wrap': 'T5Block'}
- downcast_bf16: no
- tpu_use_cluster: False
- tpu_use_sudo: False
- tpu_env: []
Information
- [ ] The official example scripts
- [X] My own modified scripts
Tasks
- [ ] One of the scripts in the examples/ folder of Accelerate, or an officially supported `no_trainer` script in the `examples` folder of the `transformers` repo (such as `run_no_trainer_glue.py`)
- [X] My own task or dataset (give details below)
Reproduction
The error occurs when I launch my training job with the accelerate launcher:

```bash
accelerate launch train.py
```
My Python script is as follows:
```python
from accelerate import Accelerator
from transformers import T5ForConditionalGeneration

MODEL_PATH = "google/flan-t5-small"


def train():
    model_name_or_path = MODEL_PATH
    model = T5ForConditionalGeneration.from_pretrained(
        model_name_or_path,
        device_map="auto",
        load_in_8bit=True,
    )
    accelerator = Accelerator()
    model = accelerator.prepare(model)


if __name__ == "__main__":
    train()
```
The full stack trace is as follows:
Overriding torch_dtype=None with `torch_dtype=torch.float16` due to requirements of `bitsandbytes` to enable model loading in mixed int8. Either pass torch_dtype=torch.float16 or don't pass this argument at all to remove this warning.
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run
python -m bitsandbytes
and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
bin /home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda117.so
CUDA SETUP: CUDA runtime path found: /nix/store/0781hi5c3vb0v7h0s701adqgg4531qib-cuda-home/lib/libcudart.so.11.0
CUDA SETUP: Highest compute capability among GPUs detected: 7.5
CUDA SETUP: Detected CUDA version 117
CUDA SETUP: Loading binary /home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda117.so...
Traceback (most recent call last):
File "/home/markh/text-fine-tuning-experiments/./finetune/issue.py", line 21, in <module>
train()
File "/home/markh/text-fine-tuning-experiments/./finetune/issue.py", line 17, in train
model = accelerator.prepare(model)
File "/home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/accelerate/accelerator.py", line 1122, in prepare
result = tuple(
File "/home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/accelerate/accelerator.py", line 1123, in <genexpr>
self._prepare_one(obj, first_pass=True, device_placement=d) for obj, d in zip(args, device_placement)
File "/home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/accelerate/accelerator.py", line 977, in _prepare_one
return self.prepare_model(obj, device_placement=device_placement)
File "/home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/accelerate/accelerator.py", line 1227, in prepare_model
model = FSDP(model, **kwargs)
File "/home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 1036, in __init__
self._auto_wrap(auto_wrap_kwargs, fsdp_kwargs)
File "/home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 1291, in _auto_wrap
_recursive_wrap(**auto_wrap_kwargs, **fsdp_kwargs)
File "/home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py", line 403, in _recursive_wrap
wrapped_child, num_wrapped_params = _recursive_wrap(
File "/home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py", line 403, in _recursive_wrap
wrapped_child, num_wrapped_params = _recursive_wrap(
File "/home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py", line 403, in _recursive_wrap
wrapped_child, num_wrapped_params = _recursive_wrap(
File "/home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py", line 421, in _recursive_wrap
return _wrap(module, wrapper_cls, **kwargs), num_params
File "/home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py", line 350, in _wrap
return wrapper_cls(module, **kwargs)
File "/home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 1079, in __init__
self._fsdp_wrapped_module = FlattenParamsWrapper(
File "/home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/torch/distributed/fsdp/flatten_params_wrapper.py", line 103, in __init__
self._flat_param_handle = FlatParamHandle(params, module, device, config)
File "/home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 270, in __init__
self._init_flat_param(params, module)
File "/home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 335, in _init_flat_param
raise ValueError("Integer parameters are unsupported")
ValueError: Integer parameters are unsupported
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 100044) of binary: /home/markh/text-fine-tuning-experiments/.venv/bin/python
Traceback (most recent call last):
File "/home/markh/text-fine-tuning-experiments/.devenv/state/venv/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
args.func(args)
File "/home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 910, in launch_command
multi_gpu_launcher(args)
File "/home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 603, in multi_gpu_launcher
distrib_run.run(args)
File "/home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/torch/distributed/run.py", line 753, in run
elastic_launch(
File "/home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/markh/text-fine-tuning-experiments/.venv/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 246, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
./finetune/issue.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2023-05-12_00:10:08
host : markh-dev-server-gpu-1a.us-central1-a.c.ml-solutions-371721.inte
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 100044)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
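For context on where the `ValueError` comes from: `load_in_8bit=True` replaces the linear weights with bitsandbytes `Int8Params`, i.e. parameters whose dtype is `torch.int8`, and FSDP's `FlatParamHandle` (the last frames of the traceback) refuses to flatten integer-dtype parameters. A minimal diagnostic sketch, run as a plain Python script on one GPU with the same model, that lists the offending parameters:

```python
import torch
from transformers import T5ForConditionalGeneration

# Load the same model in 8-bit and list the parameters that end up with an
# integer dtype -- these are the ones FSDP refuses to flatten.
model = T5ForConditionalGeneration.from_pretrained(
    "google/flan-t5-small",
    device_map="auto",
    load_in_8bit=True,
)

int8_params = [name for name, p in model.named_parameters() if p.dtype == torch.int8]
print(f"{len(int8_params)} int8 parameters")
print(int8_params[:5])
```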
Expected behavior
`accelerator.prepare` should wrap the model with FSDP and return it, without raising an error.
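For comparison, a minimal sketch of the same flow without 8-bit loading; with only floating-point parameters, FSDP's flattening step has nothing to reject (the sketch also drops `device_map="auto"`, since FSDP handles parameter placement itself):

```python
from accelerate import Accelerator
from transformers import T5ForConditionalGeneration

# Same flow without load_in_8bit: all parameters stay floating point, so
# FSDP can flatten and shard them when accelerator.prepare wraps the model.
accelerator = Accelerator()
model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-small")
model = accelerator.prepare(model)
```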
cc @younesbelkada
Hi @markhng525,
This is expected: pure int8 training is not supported. Instead, you may want to train adapters on top of the 8-bit model. I recommend the int8 training examples in the peft library, https://github.com/huggingface/peft/tree/main/examples/int8_training, which show exactly how to do that.
In other words, first wrap your model into a PeftModel and call prepare afterwards. However, I am unsure whether PeftModel + int8 is supported under FSDP; let us know how it goes.
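For reference, a condensed sketch of the pattern used in the linked peft int8 examples; note that those examples train with DDP or a single GPU rather than FSDP, and `prepare_model_for_int8_training` is the peft helper they call before adding the adapters:

```python
from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_int8_training
from transformers import T5ForConditionalGeneration

# Adapter-on-int8 pattern: keep the base weights frozen in int8 and train
# only small floating-point LoRA adapters on top.
model = T5ForConditionalGeneration.from_pretrained(
    "google/flan-t5-small",
    device_map="auto",
    load_in_8bit=True,
)
model = prepare_model_for_int8_training(model)  # freezes base weights, casts layer norms to fp32
model = get_peft_model(
    model,
    LoraConfig(task_type=TaskType.SEQ_2_SEQ_LM, inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1),
)
model.print_trainable_parameters()  # only the LoRA adapter weights are trainable
```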
I get the same error even after updating my script to use `get_peft_model`:
```python
from accelerate import Accelerator
from peft import LoraConfig, TaskType, get_peft_model
from transformers import T5ForConditionalGeneration

MODEL_PATH = "google/flan-t5-small"


def train():
    model_name_or_path = MODEL_PATH
    model = T5ForConditionalGeneration.from_pretrained(
        model_name_or_path,
        device_map="auto",
        load_in_8bit=True,
    )
    peft_config = LoraConfig(
        task_type=TaskType.SEQ_2_SEQ_LM,
        inference_mode=False,
        r=8,
        lora_alpha=32,
        lora_dropout=0.1,
    )
    model = get_peft_model(model, peft_config)
    accelerator = Accelerator()
    model = accelerator.prepare(model)


if __name__ == "__main__":
    train()
```
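This is consistent with how LoRA works: `get_peft_model` only adds small floating-point adapter weights, while the frozen base weights stay in `torch.int8`, so FSDP's auto-wrap of each `T5Block` still encounters integer parameters. A quick sketch (a diagnostic check, not a fix) showing the mixed dtypes:

```python
import torch
from peft import LoraConfig, TaskType, get_peft_model
from transformers import T5ForConditionalGeneration

# Count parameter dtypes after wrapping the 8-bit model with LoRA: the base
# weights are still torch.int8, which is what FSDP keeps rejecting.
model = T5ForConditionalGeneration.from_pretrained(
    "google/flan-t5-small", device_map="auto", load_in_8bit=True
)
model = get_peft_model(
    model, LoraConfig(task_type=TaskType.SEQ_2_SEQ_LM, r=8, lora_alpha=32, lora_dropout=0.1)
)

dtype_counts = {}
for _, param in model.named_parameters():
    dtype_counts[param.dtype] = dtype_counts.get(param.dtype, 0) + 1
print(dtype_counts)  # expect torch.int8 (frozen base) alongside float dtypes (LoRA adapters)
```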
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.